Abstract

In both the Air Force and search and rescue (SAR) communities, there is a current need to
detect and characterize persons. Existing methods use red-green-blue (RGB) imagery but
produce high false alarm rates. New technology in multispectral skin detection outperforms the
existing RGB methods, but lacks a control and processing architecture to make it efficient for
real-time problems. We hypothesize that, by taking a minimalistic approach to the software design,
we can perform image preprocessing, feature computation, and skin detection in real time.

A number of applications require accurate detection and characterization of persons,
human measurement and signature intelligence (H-MASINT) and SAR in particular.
H-MASINT requires the detection of persons in images so that other processing can be
performed. In the SAR community, it is useful as a method of finding persons who are partly
obscured or in remote regions, whether living or deceased.

We have developed a modular computing architecture to perform the acquisition and
processing in real time, as well as separate programs to perform processing and analysis of
images post-acquisition. The architecture is flexible, as one can easily add functionality
to meet growing demands. All programs were organized using a basic Model-View-Controller
design, designed using Unified Modeling Language (UML) principles, and coded
using a bottom-up approach.

Based on the results presented in this thesis, image acquisition, processing, skin
detection, viewing, and saving can be performed in real time, at nearly 10 fps. Not only does this
support the SAR community, it also gives the Air Force a new capability to help address its
H-MASINT mission.
FLEXIBLE COMPUTING ARCHITECTURE

FOR

REAL TIME SKIN DETECTION

I.  Introduction 

1.1  Background 

Recent research has produced an engineering model of human skin [1]. From evaluating
the  output  of  this  model,  the  author  was  able  to  develop  an  efficient  skin  detection  methodology 
using  two  distinct  frequencies  in  the  near  infrared.  Additional  studies  showed  that  a  majority  of 
the  surrounding  environment  could  be  suppressed  using  information  contained  in  the  visible 
wavelengths. 

Figure 1 shows an example of the capabilities of this detection strategy. The strategy
works regardless of the color of the skin, and whether the subject is alive or dead (depending on
the condition of the body). Since it does not rely on shape-based recognition, there is no
requirement for the entire person to be exposed, just a single pixel of skin. The same body of
work also presented the capability to estimate the melanin content of skin-detected pixels,
providing a mechanism to characterize the detected skin. An example of this is shown in Fig. 2.

Due to recent work in [2], the Air Force now has a capable multispectral imaging system
specialized for the task of skin detection. What is currently missing is an architecture that
controls the cameras and processes the images from the multispectral imaging system in [2].
This thesis addresses that deficiency.
Figure 1. Example of skin detection using hyperspectral imagery from the HyperSpecTIRS
(HST3) hyperspectral camera [1].


Figure  2.  Example  of  melanin  estimation  on  skin  detections  [1]. 
1.2 Research Scope

The goal of this thesis is to create a compact software architecture to control diverse
camera hardware interfaces and to provide a fast and flexible computing platform that allows the
creation, incorporation, and testing of new detection methodologies and algorithms, and that is
compatible with the multispectral system developed in [2]. The hypothesis is that, by taking a
minimalistic approach to the design, we can perform image preprocessing, feature computation,
and skin detection in real time. Furthermore, this system should provide a platform capable of
operating in real-life scenarios.

Our eventual goal is to create a mobile system for use in a variety of environments. The
primary use will be a small aircraft-mounted system for search and rescue. This
environment requires a system that operates fast enough to obtain complete ground coverage
while moving at high speeds, that works with changing aspect and distances, and that meets
size and weight restrictions.

The hyperspectral cameras used in [1] are large and expensive, and are not nearly fast
enough to provide real-time imagery, or even complete ground coverage for a fast, low-flying
aircraft. The system developed in [2] uses near infrared cameras with filters to obtain the two
necessary frequencies for skin detection. False alarm suppression is handled with a color
camera. Melanin estimation, which uses a broadband monochromatic camera, is planned for
future work.
1.3 Methodology

The software is built using a bottom-up approach. UML modeling is utilized to aid in
development. Testing is accomplished through field testing. Runtime analysis of the algorithms
and cameras is performed on the end-to-end system (hardware and software).

1.4  Resources 

Both the National Instruments cards (PCI interface) and the Imperx (Express54 interface)
Camera Link frame grabbers came with drivers and software for both C++
and Matlab. Due to the flexibility Matlab offers to the research group, it was chosen as the
language for image acquisition. However, since higher level programming is difficult
and runs slowly in Matlab, Java was chosen to implement the architecture developed in this thesis
(Java reference materials we use include [3,4,5]). The optical system used in this thesis was
developed in [2], and the algorithms in [1, 6].

1.5  Thesis  Organization 

Chapter  1  covers  the  introduction  and  basic  goals  of  the  thesis.  It  defines  the  boundaries 
and  focus  of  the  thesis. 

Chapter 2 discusses the background necessary to understand the methodology. It begins
by discussing the basics of the imagery and the human skin model. Next, the algorithms used for
skin detection in this thesis are presented. Finally, the basics of UML modeling are covered.

The methodology is covered in Chapter 3. The overall program layout is discussed, followed
by the design process for each piece of software. First, the requirements and
functionality are laid out, followed by the sequence diagrams of the method calls required to make
each function happen. Next, class diagrams are developed. Finally, this chapter describes the
video file and parameter file formats as well as the parameter definitions.

Chapter 4 covers the testing and analysis of the software. First, all known errors and
flaws are discussed. Next, a demonstration of the system is provided. An effort is made to show
the output of each algorithm in order to show that it has been correctly implemented. The results
are discussed and analyzed. Finally, this chapter analyzes the performance of the cameras and the
algorithms, from both a mathematical perspective and a time-based comparison.

Chapter  5  provides  a  summary  of  the  present  work.  It  highlights  the  important  aspects  of 
the  methodology  and  results,  and  draws  the  necessary  conclusions.  Finally,  the  chapter  discusses 
the  possibility  of  future  work. 

Finally, Appendix A lists the required software and the steps that must be taken to set up
and use it, and Appendix B is the User’s Manual for the software.
II. Background


This chapter discusses the basics of the imagery used for this research. We then discuss
the basic human skin model, and the algorithms used to process the images. This chapter also
gives the basic UML modeling techniques used to develop the software.

2.1 Hyperspectral Imaging

Standard color cameras take images at three different wavelengths of light, corresponding
to red, green, and blue. The bandwidths of these three wavelengths are generally on the order of
100 nm wide. Hyperspectral cameras also take an image, but often at several hundred different
wavelengths, ranging from the visible through the near infrared, typically 400-2500 nm. The
bandwidth at each of these wavelengths is generally on the order of 10 nm wide. This additional
information allows analysts to examine the reflectance properties of the imaged materials.

A standard hyperspectral image can be thought of as a cube, where the X-Y plane is the
two dimensional image, and the Z axis is the wavelength. Alternatively, examining a single
pixel through the height of the cube gives us the magnitude of the reflection across the spectrum
for that material. Figure 3 shows the construction of a hyperspectral cube in comparison to a
multispectral image [7].

Hyperspectral imagery contains a lot of information about the material being imaged.
Different materials respond differently at various wavelengths, and analyzing a single pixel
throughout a hyperspectral cube provides us with important information regarding the material
content in the image at that pixel. These curves can be used to discriminate between multiple
classes of materials in images.
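
As an illustration of the cube structure described above, the following Java sketch stores a cube in a flat array and pulls out either the spectrum of a single pixel or the image at a single band. The class name `SpectralCube`, its method names, and the band-interleaved-by-pixel memory layout are assumptions made for this example; they are not taken from the thesis software.

```java
/** Minimal hyperspectral cube: two spatial axes plus one wavelength axis (illustrative only). */
final class SpectralCube {
    private final int width;
    private final int height;
    private final int bands;
    // Assumed layout: all bands of a pixel are stored contiguously.
    private final double[] data;

    SpectralCube(int width, int height, int bands) {
        this.width = width;
        this.height = height;
        this.bands = bands;
        this.data = new double[width * height * bands];
    }

    void set(int x, int y, int band, double value) {
        data[(y * width + x) * bands + band] = value;
    }

    /** Returns the spectrum of one pixel: one value per band, i.e. a column through the cube. */
    double[] spectrumAt(int x, int y) {
        double[] spectrum = new double[bands];
        System.arraycopy(data, (y * width + x) * bands, spectrum, 0, bands);
        return spectrum;
    }

    /** Returns the two-dimensional image at one band, in row-major order. */
    double[] bandImage(int band) {
        double[] image = new double[width * height];
        for (int p = 0; p < image.length; p++) {
            image[p] = data[p * bands + band];
        }
        return image;
    }
}
```

With this layout, reading a pixel's spectrum is a single contiguous copy, which suits per-pixel material discrimination; a band-sequential layout would instead favor whole-image operations.
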
Figure 3. Comparison of a hyperspectral cube to a standard multispectral image [7].


2.2  Human  Skin  Reflectance 

Previous research [1, 2, 6, 8] developed a multispectral approach to skin detection, and
demonstrated it using hyperspectral imagery. We use broadband monochrome cameras that take
images in the near infrared portion of the electromagnetic spectrum. We place a filter in front of
each camera, so the image covers only a very particular bandwidth, about 10 nm wide. By using
two near infrared cameras with filters, a standard color camera, and a monochrome camera in the
visible portion of the spectrum, we can create a composite image with six distinct wavelengths
[2]. For the purpose of this research, these wavelengths are 1580 and 1080 nm in the near
infrared, 750 nm for just beyond the visible, and 660, 540, and 475 nm in the visible.

A model of human skin reflection was developed in [1]. Figure 4 shows an example skin
reflectance spectrum generated by that model. The lines show light skin (2.4% melanin) and
dark skin (24% melanin). Darker skin has reduced reflectance at the shorter wavelengths due to
melanin. The basic shape remains the same beyond 1000 nm, due primarily to water absorption.


Figure 4. Human Skin Model showing the effect of melanin on reflectance [1]. The solid line is
a plot of skin reflectance with 2.4% melanin, and the dashed line with 24% melanin.

For  the  purpose  of  this  research,  the  important  features  are  the  high  reflectance  at 
approximately  1080  nm,  and  the  low  reflectance  around  1580  nm.  The  significantly  lower 
reflection  at  the  longer  wavelengths  is  due  to  water  absorption.  However,  other  water-heavy 
materials  also  tend  to  share  this  same  drop  in  reflectance,  which  can  cause  problems  as  discussed 
later.  Figure  5  shows  the  difference  of  skin  reflectance  between  1580  nm  and  1080  nm.  We  use 
1580  nm  instead  of  1400  nm  because  there  is  a  large  amount  of  atmospheric  absorption  around 
1400  nm,  and  therefore  very  little  natural  illumination  at  that  wavelength  [1]. 
Figure 5. Filtered NIR images showing the difference in skin reflectance between 1580 nm
(top) and 1080 nm (bottom).


Another important feature is the difference between the reflectance in the red (660 nm)
and green (540 nm) portions of the spectrum. Skin is significantly more red than it is green,
whereas most common skin confusers are typically either more green than red (e.g., vegetation) or
approximately equal (e.g., snow) in reflectance.

Figure 6 shows the difference between the skin spectra and typical skin “confusers”.
Skin colored plastics and cardboard have very similar reflectance in the visible spectrum, which
causes color based approaches to fail. However, there are clear differences in the near infrared
wavelengths that allow our methods to discriminate between the materials.
Figure 6. Spectra of Type I/II (light) and Type III/IV (dark) skin (dashed and dotted lines
respectively) and spectra of a plastic doll and brown cardboard (solid and dashed-dotted
respectively) [1].


2.3  Features  for  Skin  Detection  and  False  Alarm  Suppression 

For the remainder of this thesis, $\rho_{wavelength}$ is the estimated reflectance in a given pixel,
calculated from the measured reflectance of light and dark panels, using the empirical line
method (ELM) [6].

2.3.1 Normalized Difference Skin Index

The Normalized Difference Skin Index (NDSI), developed in [1], describes a method for
skin detection by examining the difference between 1080 nm and 1580 nm. The NDSI value $\gamma$ is
found by

$$\gamma = \frac{\rho_{1080} - \rho_{1580}}{\rho_{1080} + \rho_{1580}} \tag{1}$$

where $\rho_{1080}$ and $\rho_{1580}$ are the estimated reflectances at the 1080 and 1580 nm wavelengths,
respectively.
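
A minimal Java sketch of Eqn. 1 applied pixel by pixel is given here, assuming the two NIR images are already co-registered and converted to estimated reflectance. The class and method names are hypothetical, and returning 0 when both reflectances are zero is a choice made for this example rather than something specified in [1].

```java
/** Illustrative NDSI computation; names are hypothetical, not from the thesis software. */
final class SkinIndex {
    /** Generic normalized difference (p - q) / (p + q); returns 0 when p + q is 0 (assumption). */
    static double normalizedDifference(double p, double q) {
        double sum = p + q;
        return sum == 0.0 ? 0.0 : (p - q) / sum;
    }

    /** NDSI (Eqn. 1) for every pixel of two co-registered reflectance images. */
    static double[] ndsi(double[] rho1080, double[] rho1580) {
        if (rho1080.length != rho1580.length) {
            throw new IllegalArgumentException("image sizes differ");
        }
        double[] gamma = new double[rho1080.length];
        for (int i = 0; i < gamma.length; i++) {
            gamma[i] = normalizedDifference(rho1080[i], rho1580[i]);
        }
        return gamma;
    }
}
```

For example, a pixel with hypothetical reflectances of 0.6 at 1080 nm and 0.1 at 1580 nm gives an NDSI of 0.5/0.7, or roughly 0.71. The same `normalizedDifference` helper applies unchanged to any other normalized difference feature of the same form.
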

2.3.2  Normalized  Difference  Vegetation  Index 

Reasonably performing skin detection cannot be accomplished using NDSI
alone. Even though it may be possible to identify nearly all pixels in an image that contain skin,
NDSI will also falsely identify several other materials as skin. Common confusers for skin detection
using only NDSI include some types of vegetation, snow, mud, or a raw steak. Therefore,
additional features must be used to remove these common false positives from the detections.

The equation to detect vegetation, thus enabling it to be ruled out from skin detections, is
shown in Eqn. 2, where $\alpha$ is the value of the Normalized Difference Vegetation Index (NDVI).

$$\alpha = \frac{\rho_{750} - \rho_{660}}{\rho_{750} + \rho_{660}} \tag{2}$$

In Eqn. 2, $\rho_{660}$ and $\rho_{750}$ are the estimated reflectances at the 660 and 750 nm wavelengths,
respectively. As can be seen from Eqn. 2, this feature depends on the addition of the 750 nm
camera. It is currently incorporated into the architecture but disabled.

2.3.3  Normalized  Difference  Green-Red  Index 

Another feature useful for ruling out false positives is the Normalized Difference Green-Red
Index (NDGRI). The assumption is that skin is more red than green, while most other skin
confusers are not [1]. This feature is especially useful in ruling out vegetation and snow. The
NDGRI is defined as:

$$\beta = \frac{\rho_{540} - \rho_{660}}{\rho_{540} + \rho_{660}} \tag{3}$$

where $\rho_{540}$ and $\rho_{660}$ are the estimated reflectances at 540 and 660 nm, respectively.
2.4 Algorithms for Skin Detection

Currently,  three  algorithms  for  skin  detection  are  specified  [1,6]  in  our  system.  The 
Basic  Skin  Detector  only  uses  the  NDSI  values,  while  the  rules  based  detectors  use  a  rectangular 
bound  on  either  (NDSI,  NDVI)  or  (NDSI,  NDGRI)  pairs.  The  last  detection  algorithm  is  the 
likelihood  ratio  test,  which  uses  a  continuous  function  as  the  decision  boundary  (basically  a 
bounding  polygon). 


[Figure 7 panels: (a) (NDSI, NDVI) pair; (b) (NDSI, NDGRI) pair]

Figure 7. (a) Joint distribution of NDVI and NDSI values using spectral measurements and skin
model-generated data. (b) Joint distribution of NDGRI and NDSI values using spectral
measurements and skin model-generated data. Spectral measurements of a random sampling of
materials are shown in red, and skin model-generated data are shown in black [6].


2.4.1 Basic Skin Detector

The most basic detector, as discussed above, uses only the NDSI values. This also means
the detector can be run using only two cameras. However, because it will generate a large
number of false positives, it is not particularly useful. The NDSI values that are probably skin are
$0.657 < \gamma < 0.768$ [8].
2.4.2 Rules Based Skin Detectors

Identifying a particular pixel as skin should not be based on the NDSI alone, as it results
in too many false positives. Two rules based algorithms are shown in Eqn. 4 and Eqn. 5.
Equation 4 shows the rules for determining if a pixel is skin based on NDVI and NDSI, where
the constants $a_1$ and $a_2$ are the bounds for NDVI. The default values are 0.004 and 0.503, but
may change depending on the exact environment [8]. The constants $c_1$ and $c_2$ are the bounds for
NDSI, with defaults at 0.657 and 0.768 [8]. If the result $S$ for any particular pixel is 1, then it
should be considered skin; otherwise it should not. Because this equation depends on the NDVI
feature, which depends on the 750 nm camera, it is fully incorporated into the architecture but
disabled.


$$S = \begin{cases} 1 & \text{if } a_1 < \alpha < a_2 \text{ and } c_1 < \gamma < c_2 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

Equation 5 shows the skin detection rules based on NDGRI and NDSI. The default
values for $b_1$ and $b_2$ are -0.541 and -0.062 [8].

$$S = \begin{cases} 1 & \text{if } b_1 < \beta < b_2 \text{ and } c_1 < \gamma < c_2 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$


These detection algorithms have the advantage of relying on the values of each feature
independently, so the entire algorithm consists of two feature computations and four comparisons.
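
The two feature computations and four comparisons of the (NDSI, NDGRI) rule can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, the bounds are the defaults quoted above from [8], and the zero-denominator guard is added for the example only.

```java
/** Illustrative rules based detector for the (NDSI, NDGRI) pair of Eqn. 5. */
final class RulesDetector {
    // Default bounds quoted in the text from [8].
    static final double C1 = 0.657;   // lower NDSI bound
    static final double C2 = 0.768;   // upper NDSI bound
    static final double B1 = -0.541;  // lower NDGRI bound
    static final double B2 = -0.062;  // upper NDGRI bound

    /** Normalized difference (p - q) / (p + q); returns 0 when p + q is 0 (assumption). */
    static double normDiff(double p, double q) {
        double sum = p + q;
        return sum == 0.0 ? 0.0 : (p - q) / sum;
    }

    /** Returns true when both features fall strictly inside the rectangular bound. */
    static boolean isSkin(double rho540, double rho660, double rho1080, double rho1580) {
        double gamma = normDiff(rho1080, rho1580); // NDSI, Eqn. 1
        double beta = normDiff(rho540, rho660);    // NDGRI, Eqn. 3
        return B1 < beta && beta < B2 && C1 < gamma && gamma < C2;
    }
}
```

With hypothetical reflectances, a pixel that is redder than it is green (0.3 at 540 nm, 0.5 at 660 nm) and has a strong NIR drop (0.6 at 1080 nm, 0.1 at 1580 nm) passes both tests, while a pixel with the same NIR values but more green than red (0.12 and 0.06) is rejected by the NDGRI bound.
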

2.4.3 Likelihood Ratio Test Algorithm

The likelihood ratio test is a hypothesis testing methodology where the two hypotheses
are that the pixel is skin ($H_1$) or that the pixel is not skin ($H_0$). The likelihood ratio is the ratio between
the estimates of the two likelihoods $f_1(\theta)$ and $f_0(\theta)$, where $f_1(\theta)$ is the distribution of the data
when it is skin and $f_0(\theta)$ is the distribution of the data when it is not skin. The likelihood ratio
is defined as:
$$\Lambda(\theta) = \frac{f_1(\theta)}{f_0(\theta)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{6}$$

where $\eta$ is the threshold.

The likelihood ratio as designed in [6] assumes each distribution is a Gaussian mixture
model. The Gaussian mixture models describe the joint distribution of NDSI and NDGRI for
skin present and skin absent. If the resulting ratio is greater than $\eta$, then the pixel is most likely
skin; otherwise it is not.

There are 43 parameters that must be specified per [6]. The first 42 are grouped in seven
sets of six, $[P_i, a_i, b_i, d_i, \mu_{s,i}, \mu_{g,i}]$, where $i$ varies from 1 to 7. $P_i$ is the prior probability, $a_i$, $b_i$, and $d_i$
are the elements of the covariance matrix $\begin{bmatrix} a_i & b_i \\ b_i & d_i \end{bmatrix}$, and $\mu_{s,i}$ and $\mu_{g,i}$ are the means of the
NDSI and NDGRI samples assigned to the distribution. The 43rd parameter is $\eta$, which is used
in the final comparison.

The likelihood ratio is built from the terms in Eqn. 7, where $S$ is the NDSI value and $G$ is the
NDGRI value.

Equation 7 must be run seven times, once for each value of $i$, resulting in $F_1$ through $F_7$,
where $F$ is defined as (less the subscripts for clarity)

$$F = \exp\left[-0.5\left(d(S - \mu_s)^2 - 2b(S - \mu_s)(G - \mu_g) + a(G - \mu_g)^2\right)\right] \tag{7}$$

After all seven $F$ values are calculated, the likelihood ratio is computed with

$$R = \frac{P_1F_1 + P_2F_2 + P_3F_3}{P_4F_4 + P_5F_5 + P_6F_6 + P_7F_7} \tag{8}$$

where $P_1 + P_2 + P_3 = 1$ and $P_4 + P_5 + P_6 + P_7 = 1$. If $R > \eta$, then the pixel is most likely skin; otherwise it
is not skin.
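
The following Java sketch shows one way the seven-component ratio could be evaluated. It is an illustration under explicit assumptions, not the implementation from [6]: each component is taken to be a standard, fully normalized bivariate Gaussian with covariance [[a, b], [b, d]] over (NDSI, NDGRI), which may differ from the exact form of Eqn. 7; the parameter values are the Table 1 defaults; and the class and method names are hypothetical.

```java
/** Illustrative likelihood ratio test over (NDSI, NDGRI); not the implementation from [6]. */
final class LikelihoodRatioDetector {
    // Table 1 defaults from [6]. Indices 0-2 form the numerator of Eqn. 8 (skin),
    // indices 3-6 form the denominator (not skin).
    static final double[] P    = {0.0501, 0.4286, 0.5213, 0.2966, 0.2034, 0.3733, 0.1267};
    static final double[] A    = {0.0331, 0.0164, 0.0076, 0.0808, 0.0868, 0.1408, 0.0059};
    static final double[] B    = {0.0016, 0.0002, -0.0006, -0.0111, -0.0145, -0.0026, -0.0006};
    static final double[] D    = {0.0161, 0.0153, 0.0078, 0.0202, 0.0174, 0.1981, 0.0138};
    static final double[] MU_S = {0.7318, 0.7008, 0.5548, 0.3446, 0.0875, 0.2332, 0.8983};
    static final double[] MU_G = {-0.5921, -0.3185, -0.2306, 0.4085, -0.1872, -0.0441, 0.0063};
    static final double ETA = 0.0226; // default threshold quoted from [6]

    /** Assumed component density: bivariate normal with covariance [[a, b], [b, d]]. */
    static double component(int i, double s, double g) {
        double ds = s - MU_S[i];
        double dg = g - MU_G[i];
        double det = A[i] * D[i] - B[i] * B[i];
        double quad = (D[i] * ds * ds - 2.0 * B[i] * ds * dg + A[i] * dg * dg) / det;
        return Math.exp(-0.5 * quad) / (2.0 * Math.PI * Math.sqrt(det));
    }

    /** Ratio of the skin mixture to the non-skin mixture, as in Eqn. 8. */
    static double ratio(double s, double g) {
        double skin = 0.0;
        double notSkin = 0.0;
        for (int i = 0; i < 3; i++) skin += P[i] * component(i, s, g);
        for (int i = 3; i < 7; i++) notSkin += P[i] * component(i, s, g);
        return skin / notSkin;
    }

    static boolean isSkin(double s, double g) {
        return ratio(s, g) > ETA;
    }
}
```

Under these assumptions, a feature pair near the mean of the second skin component, such as (0.70, -0.32), yields a ratio well above the threshold, while a pair at the mean of the fifth component, (0.0875, -0.1872), yields a ratio far below it. A production version would also need to guard against a denominator that underflows to zero for feature pairs far from every non-skin component.
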
The default parameters used by this system are shown in Table 1. The threshold $\eta$ is a
user-specified parameter and is used to “tweak” the system. However, $\eta = 0.0226$ is considered
in [6] a reasonable default value for the current system configuration.


Table 1. Default parameters for the Likelihood Ratio Test [6]

Variable   i=1       i=2       i=3       i=4       i=5       i=6       i=7
P          0.0501    0.4286    0.5213    0.2966    0.2034    0.3733    0.1267
a          0.0331    0.0164    0.0076    0.0808    0.0868    0.1408    0.0059
b          0.0016    0.0002   -0.0006   -0.0111   -0.0145   -0.0026   -0.0006
d          0.0161    0.0153    0.0078    0.0202    0.0174    0.1981    0.0138
μ_s        0.7318    0.7008    0.5548    0.3446    0.0875    0.2332    0.8983
μ_g       -0.5921   -0.3185   -0.2306    0.4085   -0.1872   -0.0441    0.0063

2.5  Near  Infrared  Melanosome  Index 

Another  important  aspect  of  our  system  is  the  ability  to  estimate  melanin  content  of  the 
detected  skin.  While  this  is  not  directly  part  of  identifying  skin,  it  is  useful  in  determining  if  a 
detection  is  possibly  the  subject  desired.  The  Near-Infrared  Melanosome  Index  (NIMI)  makes 
use  of  750  nm  and  1080  nm  images,  and  is  only  applied  to  pixels  that  have  already  been 
identified  as  skin.  As  with  the  NDVI,  this  feature  is  currently  disabled  until  the  addition  of  a  750 
nm  camera. 

The  first  step  in  determining  the  melanin  content  of  skin  is  to  calculate  the  ratio  between 
the  two  wavelengths  as  shown  in  Eqn.  9. 

$$N = \frac{\rho_{750}}{\rho_{1080}} \tag{9}$$

The second step is to estimate the actual percentage of melanosomes present. In humans,
the range of values is 0% to 43% [1]. The melanosome estimate for a median person
can be found by Eqn. 10, and for a typical (“standard”) person by Eqn. 11. Figure 8 shows the corresponding
lines overlaid on the range of NIMI values for each melanin value. The solid line corresponds to
the standard person, and the dashed line to the median person.

$$M_{md} = \ldots + 4.93N^4 - 9.13N^3 + 8.80N^2 - 4.90N + 1.40 \tag{10}$$

$$M_{sp} = -1.78N^5 + 7.37N^4 - 12.35N^3 + 10.84N^2 - 5.50N + 1.45 \tag{11}$$


Figure 8. The gray dots represent actual melanosome percentage vs. the calculated NIMI value.
The dashed line represents the median NIMI value, and the solid line is the average NIMI value.
The regression shown in Eqn. 10 corresponds to the median value (dashed line), and Eqn. 11 to
the standard person (solid line) [8].
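
The two-step melanin estimate can be sketched in Java as a ratio followed by a polynomial evaluation. In this minimal illustration the regression coefficients are supplied by the caller, highest power first, rather than hard-coded. The class and method names are hypothetical, and treating the result as a fraction clamped to the 0 to 0.43 range reported in [1] is an assumption of this example.

```java
/** Illustrative NIMI-based melanin estimate; names and clamping are assumptions. */
final class MelaninEstimator {
    /** NIMI (Eqn. 9): ratio of the 750 nm and 1080 nm reflectances of a skin pixel. */
    static double nimi(double rho750, double rho1080) {
        return rho750 / rho1080;
    }

    /** Evaluates a polynomial with Horner's rule; coefficients are highest power first. */
    static double polynomial(double[] coefficients, double n) {
        double result = 0.0;
        for (double c : coefficients) {
            result = result * n + c;
        }
        return result;
    }

    /** Melanosome estimate, clamped to the 0 to 0.43 range reported for humans (assumption). */
    static double melanosomeFraction(double[] coefficients, double rho750, double rho1080) {
        double m = polynomial(coefficients, nimi(rho750, rho1080));
        return Math.max(0.0, Math.min(0.43, m));
    }
}
```

Horner's rule evaluates a fifth-degree polynomial with five multiplications and five additions per pixel, which keeps the per-pixel cost small when the estimate is applied to every detected skin pixel.
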

2.6 Empirical Line Method

There are many factors that can affect the image that a camera produces. It can vary
depending on lighting, shadows, objects in the image, or any number of other factors. These can
affect different portions of the spectrum differently. The most notable issue is atmospheric
absorption. The Empirical Line Method (ELM) is frequently used in the remote sensing communities
to remove linear atmospheric effects [9].
We use a pair of calibration panels for atmospheric correction, one light and one dark.
Since we know the exact reflectance of each panel at each wavelength, and we can measure the
panel values in the images, we can apply a gain and offset to adjust the received images to an
estimate of the actual reflectance.

To perform ELM, we first calculate two ratios, $d$ and $b$, as shown in Eqn. 12 and Eqn. 13.
The variables $\rho_{white}$ and $\rho_{gray}$ are the actual reflectance values of the panels used, the defaults of
which are shown in Table 2, which shows the dependence on wavelength. The variables $L_{white}$
and $L_{gray}$ are the measured values of the two panels from the camera.


Table 2. Default values for the ELM calculation [2].

Wavelength   ρ_white   ρ_gray
Visible      0.99      0.075
750 nm       0.99      0.09
1080 nm      0.989     0.108
1580 nm      0.987     0.131

$$d = \frac{\rho_{gray} - \rho_{white}}{L_{gray} - L_{white}} \tag{12}$$

$$b = \frac{L_{white}\,\rho_{gray} - L_{gray}\,\rho_{white}}{\rho_{gray} - \rho_{white}} \tag{13}$$

For implementation purposes, the values of $d$ and $b$ only need to be calculated once.
After they have been calculated, they are stored and used on all subsequent images. Equation 14
shows the operation that must be run on every pixel of every image, where $L$ is the measured
value of the pixel and $\rho$ is the estimated value of the reflectance of that pixel.

$$\rho = (L - b)\,d \tag{14}$$
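
The calibrate-once, apply-everywhere pattern described above can be sketched in Java as follows. This is a minimal illustration: the class and method names are hypothetical, and the measured panel values of 500 and 3500 counts in the usage note are invented for the example. The gain and offset follow the two-panel definitions of $d$ and $b$ given above.

```java
/** Illustrative Empirical Line Method correction for one wavelength band. */
final class EmpiricalLine {
    final double d; // gain, Eqn. 12
    final double b; // offset, Eqn. 13

    /** Computes gain and offset once from the known and measured values of the two panels. */
    EmpiricalLine(double rhoGray, double rhoWhite, double lGray, double lWhite) {
        this.d = (rhoGray - rhoWhite) / (lGray - lWhite);
        this.b = (lWhite * rhoGray - lGray * rhoWhite) / (rhoGray - rhoWhite);
    }

    /** Eqn. 14: estimated reflectance of a single measured pixel value. */
    double reflectance(double measured) {
        return (measured - b) * d;
    }

    /** Applies Eqn. 14 to every pixel of an image using the stored gain and offset. */
    double[] correct(int[] image) {
        double[] rho = new double[image.length];
        for (int i = 0; i < image.length; i++) {
            rho[i] = reflectance(image[i]);
        }
        return rho;
    }
}
```

As a check on the algebra, constructing `new EmpiricalLine(0.108, 0.989, 500, 3500)` with the 1080 nm panel reflectances and the hypothetical counts maps a measured value of 500 back to 0.108 and a measured value of 3500 back to 0.989, so both panels land on their known reflectances.
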
2.7 UML Modeling and Software Development

The first step in developing the computing architecture is to create a number of use cases.
After writing out a number of brief use cases, a general program flow and functionality is
established, and then basic sequence diagrams are created. The sequence diagrams shown in
Chapter 3 are created using the MaintainJ plug-in for Eclipse [10] and are generated
automatically during run time.

The next step is to create class diagrams to determine exactly where classes should be
placed and what access they should be given. As with the sequence diagrams, only the final
versions are shown. The class diagrams shown are created using eUML2 [11] to directly reverse
engineer the code. The final steps in the development process are testing and analysis. The
following discussions on UML design are derived from [12].

2.7.1 Use Cases

Use  cases  are  text  based  descriptions  of  how  users  interact  with  the  computer  system. 
The  goal  of  use  cases  is  to  provide  the  developer  with  an  idea  of  what  functions  the  system  needs 
to  perform  and  all  required  interactions  with  the  user  in  order  to  complete  its  goal.  They  are 
generally  the  first  step  in  the  development  process  because  this  is  how  the  developer  determines 
what  the  system  actually  is. 

There  are  several  levels  of  use  cases.  The  most  basic  are  brief  use  cases,  which  are 
generally  single  paragraph  summaries  covering  the  main  success  scenario.  Next  are  casual  use 
cases,  which  are  several  paragraphs  covering  various  scenarios.  The  final  level  is  fully  dressed 
use  cases,  which  are  formal  and  can  be  several  pages  long.  Formal  use  cases  cover  every  step 
and  variation  in  detail,  and  contain  all  prerequisites  and  result  guarantees.  Fully  dressed  use 
cases  are  generally  written  out  in  an  outline  form  instead  of  paragraph  form. 
For the purpose of our system, only brief use cases are developed. Since this is as much a
scientific tool as it is an end-user piece of software, anything off the main success path is usually
considered unacceptable and will cause the program to end. Furthermore, the exact dialog
between user and system is not as important as the ability of the system to perform some
function. For both these reasons, there is little purpose in progressing beyond brief use cases.
Additionally, the use cases have not been included in this document, as they do not significantly
add to the reader's understanding of the architecture post-development.

2.7.2  Sequence  Diagrams 

After the system functions have been determined, the developer must determine how the
system will perform its required functions. Sequence diagrams are useful for this, because they
allow the developer to determine what classes and objects are needed, the methods required in
those classes, and the data that must be passed to each method. Post development, the
sequence diagrams help the reader understand exactly what is being done in the program for each
function.

2.7.3 Class Diagrams

The  class  diagrams  are  created  from  the  sequence  diagrams.  The  developer  takes  the 
classes  and  methods  deemed  necessary,  and  organizes  them  into  a  useful  structure.  There  is  very 
little  new  information  over  the  sequence  diagrams,  as  this  is  basically  a  direct  derivation. 
However,  class  diagrams  are  a  much  easier  format  to  understand  and  program  the  basic  structure 
from. 
III. Methodology


The  goal  of  this  chapter  is  to  present  a  design  that  can  be  used  to  create  a  skin  detection 
system.  It  describes  the  design  in  detail  so  future  developers  can  modify  and  expand  this  design. 
This  chapter  discusses  the  methods  and  processes  used  to  design  the  software  architecture.  The 
first  topic  is  the  top  level  design  that  decides  how  the  programs  are  divided.  Then,  the  design  of 
each  program  is  discussed. 

3.1 Top Level Description

The  capability  of  the  system  is  contained  in  three  distinct  programs.  This  is  due  to  a 
logical  separation  in  the  requirements  for  any  particular  application.  While  in  the  field,  a  user 
needs  to  be  able  to  perform  real  time  processing  and  view  the  raw  data  and  results.  The  user  also 
needs  to  be  able  to  save  multiple  data  streams  of  his  or  her  choosing.  Because  the  field  deployed 
system  needs  to  be  as  compact  and  efficient  as  possible,  a  minimalistic  program  is  needed, 
causing  the  separation  of  these  functions  from  any  others.  This  software  will  be  referred  to  as 
the  Acquisition  Program  for  the  remainder  of  this  thesis. 

Two other capabilities are desired. One allows the user to process data from one
or more previously saved streams and save the result. The other allows the user to play back a
previously saved file. While it would be entirely possible to place both of these functions in the
same piece of software, it is the opinion of the author that the additional complexity is
detrimental to the portability to less capable systems and to the speed of the software. These
pieces of software will be referred to as the Processing Program and the Playback Program.
Figure 9 shows the interactions and ordering of these programs.
Figure 9. Interactions and basic purposes of the three programs in the architecture.


3.1.1  Video  File  Format 

All  three  programs  must  share  the  same  video  format  in  order  for  the  system  to  work. 
Table  3  shows  the  format  of  the  different  fields  in  the  file  and  their  sizes.  The  size  of  the  data 
field  in  an  image  will  change  depending  on  the  height  and  width  of  the  image,  as  specified  in  the 
header. The data size shown is for a 640x512 image (327,680 pixels, or 1,310,720 bytes at 4 bytes
per pixel). Each pixel of the data is stored as a 32-bit integer, regardless of the bit depth, due
to limitations of Java. If the coordinates for a panel are not set, they are set to ‘-2’.

Figure  10  shows  an  example  sequence  in  a  video  file.  The  first  item  is  always  the  file 
header.  This  is  the  only  place  in  the  file  this  should  occur.  Following  the  file  header  is  a 
sequence  of  interspersed  images  and  panel  coordinates  (locations  of  the  grey  and  white  panels  as 
required for the Empirical Line Method algorithm). When the program is reading through the
file,  it  will  start  by  reading  a  long  integer  (8  bytes)  from  the  file.  If  it  is  equal  to  ‘-1’  then  the 
item  is  a  set  of  panel  coordinates,  otherwise  it  is  an  image. 
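
The read loop described above can be illustrated with a minimal sketch. It assumes the field order and sizes of Table 3 and big-endian byte order (the default of java.io.DataInputStream); the thesis does not state the byte order, and the class, field, and method names below are hypothetical rather than taken from the actual software.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical reader for the item layout of Table 3; all names are illustrative. */
final class VideoItemReader {
    static final long PANEL_MARKER = -1L; // 8-byte value that flags a panel-coordinate item

    final int bitDepth, height, width;
    final int[] panels = new int[4];      // dark X, dark Y, light X, light Y

    long frameNumber, timeStamp;          // metadata of the most recently read image
    int[] pixels;                         // one 32-bit integer per pixel

    private final DataInputStream in;

    /** Reads the file header: seven 4-byte integers (28 bytes). */
    VideoItemReader(DataInputStream in) throws IOException {
        this.in = in;
        bitDepth = in.readInt();
        height = in.readInt();
        width = in.readInt();
        for (int i = 0; i < 4; i++) panels[i] = in.readInt();
    }

    /** Advances to the next image, applying any panel updates met on the way; false at end of file. */
    boolean nextImage() throws IOException {
        while (true) {
            long first;
            try {
                first = in.readLong();    // either the '-1' marker or a frame number
            } catch (EOFException e) {
                return false;
            }
            if (first == PANEL_MARKER) {
                for (int i = 0; i < 4; i++) panels[i] = in.readInt();
                continue;                 // keep reading until an image is found
            }
            frameNumber = first;
            timeStamp = in.readLong();
            pixels = new int[height * width];
            for (int i = 0; i < pixels.length; i++) pixels[i] = in.readInt();
            return true;
        }
    }
}
```

Because the first eight bytes of every item decide its type, a reader of this shape never needs to know in advance where the panel coordinates occur in the file.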
Table 3. Data types and their fields and associated sizes defined in our video file format.

Data Type          Field Descriptor   Size (Bytes)
File Header        Bit Depth          4
                   Height             4
                   Width              4
                   Dark Panel X       4
                   Dark Panel Y       4
                   Light Panel X      4
                   Light Panel Y      4
Image              Frame Number       8
                   Time Stamp         8
                   Data               1310720 (640x512)
Panel Coordinates  '-1'               8
                   Dark Panel X       4
                   Dark Panel Y       4
                   Light Panel X      4
                   Light Panel Y      4
Figure 10. Example video file layout showing how the data types are interspersed. The
beginning of the file is always a header, followed by images with a periodic set of panel
coordinates mixed in.


This methodology of interspersing item types can easily be extended to include
additional information in the file. One such possible upgrade might be to include information
to highlight a skin detection region. Table 4 shows what such an extension might look like. All
three programs would have to be updated to deal with the new item type.
Table 4. Example of another possible data type that could be added to extend the functionality
of the video file format.

Data Type            Field Descriptor
Boundary Definition  '-2'
                     Center X
                     Center Y
                     Radius

3.1.2  Parameter  File  Format 

The  parameter  file  is  shared  by  both  the  Acquisition  Program  and  the  Processing 
Program.  Since  the  Playback  Program  does  not  run  algorithms,  there  is  no  need  for  the 
parameter  file  in  that  program. 

The parameter file is called “ParamFile.txt” and must be placed in the same folder that
the video files are placed in. It is arranged as a comma separated list, where one variable is
placed on each line. All variables, except those for Empirical Line Method (ELM), have defaults
in the program, but those defaults can be overridden by listing them in the parameter file. There
is no particular order of variables required by the file. If a variable is entered twice, the second
entry takes precedence.

Table 5 provides a list of available parameters and the algorithm they are associated with.
The parameters for the rules-based algorithms are shared by the basic detector and the two
rules-based detectors. The a1 and a2 parameters are the thresholds applied to Normalized Difference
Vegetation Index (NDVI), b1 and b2 for Normalized Difference Green-Red Index (NDGRI), and
c1 and c2 for Normalized Difference Skin Index (NDSI).

The Fi parameters are not defined in the program, and therefore must be defined in the
parameter file if the likelihood ratio algorithm is to be run. The Fi variable name must be
followed by six numbers, all comma separated. All other variables should be followed by only
one parameter. Comments can be included in the file by placing a “//” as the variable name,
followed by the comment. Figure 11 shows an example parameter file.
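
The rules above (one comma separated variable per line, later entries overriding earlier ones, “//” marking a comment) can be sketched as a small parser. This is an illustrative sketch only; the class and method names are hypothetical, and the way the actual software stores its parameters is not described here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for lines of ParamFile.txt; not the thesis implementation. */
final class ParamFileParser {
    /** Returns name -> values, starting from the supplied defaults and applying each line in order. */
    static Map<String, double[]> parse(List<String> lines, Map<String, double[]> defaults) {
        Map<String, double[]> params = new HashMap<>(defaults);
        for (String line : lines) {
            String[] fields = line.trim().split(",");
            String name = fields[0].trim();
            if (name.isEmpty() || name.startsWith("//")) {
                continue;                          // blank line or comment
            }
            double[] values = new double[fields.length - 1];
            for (int i = 1; i < fields.length; i++) {
                values[i - 1] = Double.parseDouble(fields[i].trim());
            }
            params.put(name, values);              // a repeated name replaces the earlier entry
        }
        return params;
    }
}
```

The lines could be obtained with java.nio.file.Files.readAllLines on the ParamFile.txt path. Storing every value as an array lets the six-number Fi entries and the single-number entries share one code path.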


Table 5. Loadable parameters associated with the algorithms and their default values.

Algorithm         Parameter  Default Value
Rules Based       a1         0.004
                  a2         0.503
                  b1         -0.541
                  b2         -0.062
                  c1         0.657
                  c2         0.768
NIMI              p1         -1.78
                  p2         7.37
                  p3         -12.35
                  p4         10.84
                  p5         -5.5
                  p6         1.45
Likelihood Ratio  eta        0
                  F1         Undefined
                  F2         Undefined
                  F3         Undefined
                  F4         Undefined
                  F5         Undefined
                  F6         Undefined
                  F7         Undefined
ELM               rw1580     0.987
                  rw1080     0.989
                  rw750      0.99
                  rw660      0.99
                  rw540      0.99
                  rw475      0.99
                  rd1580     0.131
                  rd1080     0.108
                  rd750      0.09
                  rd660      0.077
                  rd540      0.075
                  rd475      0.073
Figure 11. Example parameter file. The Fi lines are required for the likelihood ratio because
there are no defaults in the software. The // parameter is used for comments.


3.2  Acquisition  Software  Design 

The goal of this system is to enable the user to view and save images from the camera, as
well as view and save the output of the higher level algorithms. All operations are on real time
images and must be performed in real time.

3.2.1  Program  Overview 

The software will follow a basic Model-View-Controller pattern [12]. This pattern
allows the Graphical User Interface (GUI) of the program to be separate from the basic logic of
the system. It also allows changes to be made in the underlying hardware without changing the
logic of the software. This program consists of two primary threads. The GUI controls all
asynchronous operations, and the main program loop within the main() method controls all
synchronous operations. Everything that occurs based on a user input is controlled by the GUI,
and all the continuous functions of the program, such as receiving the next image from each
camera and updating the outputs, are controlled by the main loop.

Another physical requirement of the system is that each camera is controlled by a
separate class. Therefore, the software has three input classes: one for each Near Infrared (NIR)
camera and one for the color camera. A fourth input class for the 750 nm camera is
incorporated into the system, but is disabled. Based on the functionality of the program
determined by the use cases, there are two output classes: an output window and a video file
creator. There are a large number of distinct algorithms that can be performed. Furthermore,
there is another class that loads the parameters file. These classes are all wrapped in the
DataHandlers package.

The controller and the DataHandlers classes (input, output, and algorithms) are given
access to the model. The model consists of a large number of images, where each image
represents the most recent image from a particular camera or algorithm output. It also contains
the parameters used by all the algorithms. When the program is run, there are two instances of
the Model created. One contains the raw images, and the other contains preprocessed images
(those that have been processed using the ELM algorithm). Furthermore, there is a Picture class
that defines exactly what data is contained in an image. Throughout the architecture, wherever
images are used, they are instances of the Picture class.

The controller is responsible for translating the basic operations called from the GUI and
main program loop into the lower level logic required to synchronize the rest of the program.

The CameraAcquisition class is the compiled Matlab code. Each of its methods works
directly with the cameras. There are methods for initializing a camera, triggering all the cameras
at once, getting the most recent frame from a camera, and closing all the cameras. This class is
compiled using the Matlab Builder JA, and is wrapped in a separate .jar file.

3.2.2  Sequence  Diagrams 

When the program is started, it must initialize each camera and start the GUI. The
sequence diagram in Fig. 12 shows the method calls required. As can be seen, the main()
method in AcqMain first initializes the controller, and then the GUI. This is because the GUI
needs to have knowledge of the controller, so it is necessary to start the controller first. On
initialization, the controller loads the parameters file, then starts each camera. The final two
function calls from the main() method, run() and updateFrameRate(), are the two calls repeatedly
made from the main program loop. The sequence diagrams of Controller.run() are discussed in
more detail later in this section.

When no algorithms are running, the sequence of calls made from the Controller.run()
method is shown in Fig. 13. It triggers the cameras and gets the most recent frame from each.
These frames are stored directly in the model by each camera class; the controller is not
responsible for managing the pictures directly.

When the user first opens a window, the sequence shown in Fig. 14 occurs. The GUI
must retain a link to the open window in order to close it when the user chooses. The controller,
however, is responsible for making all other method calls to the output windows. The identifier
determines what should be displayed in the window. If the user opens a video file creator instead
of a window, the function call is OpenImageFile() instead of OpenImageView(), but the sequence
is otherwise identical.

After a window is open, the sequence of calls from the Controller.run() method changes
to that shown in Fig. 15. The window opened shows only the raw data from one of the cameras,
with no algorithms applied. The only difference from the sequence without the window open is
the Update() call. If multiple windows are open, then this method call is made to each.
Figure 12. Startup sequence of the Acquisition Program. The Controller, GUI, Models, and
Cameras are started, and then the program is placed into a continuous loop updating the model
and outputs with the most recent data.

Figure 13. Controller.Run() sequence with no open outputs.

Figure 14. Sequence of calls from GUI to open a Window.

Figure 15. Sequence of Controller.Run() following the opening of a window to view raw
images from a camera. Notice the addition of the final call, Window.Update(), that is not present
in Fig. 13.

The  sequence  diagram  in  Fig.  16  shows  the  sequence  of  calls  following  an  update  call 
with  an  open  window  that  is  showing  images  with  the  rules-based  skin  detection  algorithm,  using 
NDGRI,  NDSI,  and  ELM.  This  combination  is  chosen  because  it  requires  running  multiple 
algorithms,  showing  the  dependency  structure. 

Another  important  feature  of  this  sequence  is  the  call  to  the  ELM  algorithm 
(AlgorithmPre).  This  function  call  populates  the  second  instance  of  the  Model  class,  and  is  only 
run  if  an  open  output  requires  it.  Because  windows  have  no  knowledge  of  which  model  they  are 
actually  using,  AlgorithmPre  is  run  directly  from  the  controller  and  given  access  to  both 
instances  of  Model. 

The final feature of interest is the algorithm dependencies. Any open output has
knowledge of which algorithm it relies on, and which image it uses in the model. If the image is
not updated in the model, the window calls the associated algorithm (in this case,
AlgorithmRulesNDGRI). This algorithm only has knowledge of the images it directly relies on,
and their associated algorithms. This approach ensures that no algorithm or output window has
to have knowledge of more than its direct dependencies, and it avoids running the same
algorithm multiple times if multiple windows are open.
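
The dependency scheme described above can be illustrated with a minimal sketch in which each algorithm remembers the frame for which its result is current and updates only its direct inputs. The class below is hypothetical and stands in for the thesis's algorithm classes; the runCount field exists only so the example can show that nothing runs twice for the same frame.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative node in the dependency chain; not the thesis's algorithm classes. */
final class LazyAlgorithm {
    final String name;
    private final List<LazyAlgorithm> directInputs = new ArrayList<>();
    private long computedFrame = -1;   // frame for which the stored result is current
    int runCount = 0;                  // instrumentation for the example only

    LazyAlgorithm(String name, LazyAlgorithm... inputs) {
        this.name = name;
        for (LazyAlgorithm input : inputs) directInputs.add(input);
    }

    /** Brings this result up to date for the given frame; already-current results are reused. */
    void update(long frame) {
        if (computedFrame == frame) {
            return;                                // another output already triggered this run
        }
        for (LazyAlgorithm input : directInputs) {
            input.update(frame);                   // only direct dependencies are known here
        }
        // Placeholder: the real algorithm would read its input images from the
        // model and store its output image back into the model at this point.
        runCount++;
        computedFrame = frame;
    }
}
```

With this shape, two open windows that both rely on the same index image cause that image to be computed once per frame, because the second update call returns immediately.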


Figure 16. Sequence diagram of Controller.Run() when an output Window is showing images
processed by the rules-based, NDSI, NDGRI, and ELM algorithms.
Figure 17 shows the sequence to close an open window or file. This sequence is identical
regardless of whether the output is a video file creator or a window. Not shown is that the GUI
immediately removes the output from its list of open outputs, but does not close the window
thread itself until 500 ms later, in order to prevent a race condition between the Controller and
GUI over the OutputImageHandler. Without this delay, the Controller may attempt to access a
thread that no longer exists in memory.


Figure 17. Sequence diagram showing the function calls that occur when a user selects to close
an open window or file.


Another  key  piece  of  functionality  for  this  program  is  the  ability  to  set  points  in  the  scene 
containing  the  light  and  dark  panels,  so  that  the  ratios  required  for  ELM  can  be  generated.  Due 
to  the  requirement  that  the  Processing  Program  be  able  to  run  all  algorithms  on  the  raw  data  but 
with  different  parameters,  the  Acquisition  Program  cannot  save  the  generated  ratios  because 
those  rely  on  parameters.  Instead,  it  must  save  the  exact  coordinates  at  the  moment  the  user  sets 
them. 

Coordinates  for  either  panel  can  be  set  in  any  open  window  viewing  the  raw  data.  Right 
clicking  sets  the  point  containing  the  light  panel,  and  left  clicking  sets  the  dark  panel.  After  the 
points  are  set,  the  location  of  the  light  panel  is  shown  as  a  light  blue  dot,  and  the  dark  panel  as  a 
blue  dot.  These  dots  will  appear  in  every  open  window  showing  uncorrected  images. 
In every open video file, whenever a calibration point is changed, the
VideoFileCreator.append() method will write a ‘-1’ instead of the frame number in the video file,
and then write both coordinates. The program will then write the subsequent picture from the
beginning as normal.
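
A minimal sketch of this append behavior follows. Only the method name append() comes from the text; the class name, the constructor, the way a changed calibration point is detected (by comparing against the last coordinates written), and the big-endian byte order of java.io.DataOutputStream are all assumptions made for illustration.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

/** Illustrative writer for the video file format; not the thesis's VideoFileCreator. */
final class VideoItemWriter {
    private final DataOutputStream out;
    private int[] lastPanels;   // dark X, dark Y, light X, light Y last written to the file

    /** Writes the file header: bit depth, height, width, and the initial panel coordinates. */
    VideoItemWriter(DataOutputStream out, int bitDepth, int height, int width, int[] panels)
            throws IOException {
        this.out = out;
        out.writeInt(bitDepth);
        out.writeInt(height);
        out.writeInt(width);
        for (int p : panels) out.writeInt(p);
        lastPanels = panels.clone();
    }

    /** Appends one picture, preceded by a panel-coordinate item if the coordinates changed. */
    void append(long frameNumber, long timeStamp, int[] pixels, int[] panels) throws IOException {
        if (!Arrays.equals(panels, lastPanels)) {
            out.writeLong(-1L);                   // marker written in place of a frame number
            for (int p : panels) out.writeInt(p);
            lastPanels = panels.clone();
        }
        out.writeLong(frameNumber);               // the picture itself starts from the beginning
        out.writeLong(timeStamp);
        for (int px : pixels) out.writeInt(px);
    }
}
```

Writing the coordinates immediately before the first picture they apply to means a reader always holds the correct panel locations by the time it reaches that picture.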

Figure  18  shows  two  sequences  related  to  establishing  and  removing  calibration  points. 
The  first  occurs  when  a  point  is  clicked  on  the  image.  The  second  is  when  the  “clear  points” 
button  on  the  main  GUI  is  clicked. 
Figure 18. Sequence diagram showing the user setting points in the scene representing a panel
(left), and clearing the points from the main GUI (right).


The  final  sequence  diagram  for  the  Acquisition  Program,  Fig.  19,  shows  the  method  calls 
that  must  be  made  when  exiting  the  program.  These  are  required  to  close  the  camera  interfaces 
and  properly  close  out  all  files  and  visual  windows.  If  the  cameras  are  not  shut  down  correctly 
by  the  controller,  then  they  may  not  be  able  to  start  back  up  correctly. 
Figure 19. Sequence diagram for shutting down the system.


3.2.3  Class  Diagrams 

Figure  20  shows  the  final  class  diagram  for  the  Acquisition  Program.  The  layering  in  the 
program  is  visible  here.  The  GUI  package  is  at  the  topmost  level,  with  the  Controller  directly 
between  it  and  the  remaining  classes.  The  DataHandlers  package  is  below  the  Controller,  but 
still  above  the  Model  and  CameraAcquisition  packages,  which  make  up  the  bottommost  layer. 
The  CameraAcquisition  package  is  the  compiled  Matlab  code. 
Figure 20. Class diagram of Acquisition Program. The application follows the
Model-View-Controller pattern.


3.2.4  General  Discussion 

While the structure of the program is fairly modular and flexible, we did not take it as far
as we could. For example, if a later programmer chooses to add another algorithm, they must
touch at least part of almost every class. While most of the edits would be straightforward, such as
adding another item to a list, they still require detailed knowledge of the system. However, had we
chosen to make everything completely modular, it would have significantly increased the size and
complexity of the program, slowing down its operation. Therefore we have tried to strike a
balance between expandability and simplicity.
3.3 Processing Software Design

The goal of this program is to allow the user to generate video files using different
algorithms or parameters than those used in real time. For example, in order to obtain the fastest
frame rate of acquisition possible and still be able to generate all outputs later, the user must save
the raw data streams and avoid running algorithms in real time. However, they can later use this
program to generate the required files. It also allows the user to tweak the parameters, running
algorithms on the same pieces of data multiple times.

The underlying design of this piece of software is very similar to the Acquisition
Program. However, there are three key differences. The first is that the acquisition is from files
instead of cameras and sensors. The second is that there are no options to display streams, only
to save them. This is primarily because the input from the files and the processing will be done
as fast as possible, not in relation to the rate that the images were taken. The third primary
difference from the Acquisition Program is that only one algorithm can be run at a time, and the
user is responsible for selecting the correct files as required by that algorithm.

3.3.1  Program  Overview 

The video files generated by both the Acquisition and Processing Programs are saved as a
sequence of pictures. There is a header at the beginning of the file, with the dimensions of the
image and the bit depth, as well as the initial X-Y coordinates of both panels as required for
ELM. The rest of the file is a collection of pictures. Each picture consists of an array with the
actual image, a frame number, and a timestamp of when the picture was taken. If the frame
number read is ‘-1’, the four following integers are the new X-Y coordinates of the panels, and
then the standard sequence resumes.
Since the video files from different inputs will almost invariably have been started at
different times, they must be synchronized in order to get any useful results. The program begins
by reading the first picture from each file, and then steps through the file that starts first until
the timestamp matches that of the other file. If the input files do not overlap, then the program
does not produce an output file and warns the user.
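
The synchronization step can be sketched over two arrays of timestamps, one per input file. This sketch makes two assumptions that the text does not state: timestamps within a file never decrease, and a "match" means the earlier file has reached or passed the first timestamp of the later file, since an exactly equal timestamp may not exist. The names are hypothetical.

```java
/** Illustrative synchronization of two recorded streams by timestamp; names are hypothetical. */
final class StreamSync {
    /**
     * Returns {indexA, indexB}, the first frame of each stream to process together,
     * or null when the streams do not overlap in time.
     */
    static int[] synchronize(long[] timesA, long[] timesB) {
        if (timesA.length == 0 || timesB.length == 0) {
            return null;
        }
        int a = 0;
        int b = 0;
        if (timesA[0] < timesB[0]) {
            // Stream A starts first: step through A until it catches up with B's first frame.
            while (a < timesA.length && timesA[a] < timesB[0]) a++;
        } else {
            // Stream B starts first (or both start together): step through B instead.
            while (b < timesB.length && timesB[b] < timesA[0]) b++;
        }
        boolean exhausted = (a == timesA.length) || (b == timesB.length);
        return exhausted ? null : new int[] {a, b};
    }
}
```

A null result corresponds to the non-overlapping case, in which the program would warn the user instead of producing an output file.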

3.3.2  Sequence  Diagrams 

This program is completely sequential, following a single path from start to finish. There
is an option to exit when selecting the algorithm; otherwise, the program must continue
along the single path.

The sequence diagram shown in Fig. 21 illustrates the method calls that take place when
the program is first opened. The user must first supply the program with the output filename,
before the main GUI opens. The actionPerformed() on FileHandler$2 is the user clicking the
button to start the file save. The controller has one associated file handler, which is the output
file. All input file handlers are controlled by the algorithm. This is one primary difference from
the Acquisition Program, where input, algorithms, and outputs are all associated directly with the
controller.
Figure 21. Startup sequence of the Processing Program. All work for the remainder of the
program is performed by logic inside Controller.Update().


The  sequence  diagrams  for  the  remainder  of  the  program  are  not  particularly  helpful  in 
understanding  its  operation.  The  program  from  here  on  out  is  a  continuous  loop.  First  it  checks 
to  see  if  an  algorithm  has  been  selected,  then  uncovers  the  file  selection  windows  in  turn,  until  all 
input  files  have  been  selected.  Once  the  last  file  is  selected,  the  program  calls  the  synchronize() 
method,  and  then  the  Algorithm.Run()  method  on  the  selected  algorithm.  Once  the  end  of  either 
input file is detected, the program closes.

3.3.3  Class  Diagrams 

One of the primary differences from the Acquisition Program is the replacement of the
Model class with a Parameters class. It is basically identical to the Model class except that it does
not contain any pictures. Another difference is that the FileHandler class replaces the Camera,
Window, and VideoFileCreator classes of the Acquisition Program. There is also no
CameraAcquisition class, as there are no cameras. The class structure of the Processing Program
is shown in Fig. 22.
Figure 22. Class diagrams for the Processing Program. The Processing Program has a
simplified version of the Acquisition Program class structure.


3.3.4  General  Discussion 

The Processing Program is much more straightforward and less elegant than the
Acquisition Program. It also requires the user to have knowledge of how each processing
algorithm works and what information is needed, as there are no dependencies built into this
aspect of the system. For example, to perform likelihood ratio based skin detection, the user
must first know to run the NDSI and NDGRI algorithms. Because each video file does not store
what sort of image is saved in it, the user must keep track of what they store in each file (e.g., by
using filename conventions).

3.4  Playback  Software  Design 

The Playback Program is the simplest of the three programs. Its only function is to open
and display a video file, with basic functionality for video playback. These functions include the
ability to play, pause, and move around in the video. It must also ensure that the playback speed
matches the speed at which the video was recorded. The source code borrows heavily from the
FileHandler class of the Processing Program and the Window class of the Acquisition Program.

3.4.1  Program  Overview 

There are two options for moving in the video. The first is by frame number, which
moves the current view to that frame number, and the second is by time, which searches the video
for the closest timestamp (since the exact matching timestamp might not exist). The frame
number and time of the current image are displayed to the user with each frame. In order for the
video to display at the correct speed, the program looks at the difference in time between the
current frame and next frame (Δt). It then displays the next frame Δt after the current frame.
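
The pacing rule can be illustrated with a short sketch. It assumes that timestamps are in milliseconds, which the text does not state, and it subtracts the time already spent reading the next frame so that the frame appears Δt after the current one rather than Δt after the read finishes; that correction is an assumption of this sketch, and the names are hypothetical.

```java
/** Illustrative playback pacing; the millisecond unit and all names are assumptions. */
final class PlaybackPacer {
    /**
     * Milliseconds to wait before showing the next frame.
     * elapsedMs is the time already spent since the current frame was displayed.
     */
    static long delayMillis(long currentStampMs, long nextStampMs, long elapsedMs) {
        long deltaT = nextStampMs - currentStampMs;   // recorded spacing between the frames
        return Math.max(0L, deltaT - elapsedMs);      // never wait a negative amount
    }
}
```

When reading a frame takes longer than Δt, the delay is zero and playback falls behind the recorded rate.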

3.4.2  Sequence  Diagrams 

Like the Processing Program, this is a single-use utility. A user cannot open the utility and
then start multiple video files from one instance. The first sequence diagram for the Playback
Program, Fig. 23, shows the procedure for opening a video file.
Figure 23. Sequence of opening a video file in the Playback Program. Notice there is a single
input file opened, and no separate outputs.


After the user selects “play” to play the selected video file, the main program loop in the
GUI makes a method call to Controller.Play() every cycle. This loop is only broken when the
user selects some other function, or the end of the file is reached. Figure 24 shows the sequence
of calls made before the pause button is pressed, and after. When the GUI is updated after a
pause, it no longer reads the most recent picture from the file; it just uses the current one over
and over.
Figure 24. Sequence before and after the “Pause” button is pressed. Notice that no new pictures
are read in the update loop following the pause action.


The most complicated function of the program is the ability to move within a file. Due
to the linear nature of the file layout, the searches proceed based on a simple comparison of
equality with the search parameter provided. Whenever either type of search is started, the file is
closed and reopened to reset it to the beginning. If the selected frame number is before the
beginning of the file, the first image in the file is displayed, along with a message informing the
user of such and the number of the first frame from the file. The method continues to loop until
either a match or the end of the file is found. The search by frame number sequence is shown in
Fig. 25. Searching for a particular time is virtually identical to searching for a frame number; the
only logical difference is the value being compared.
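
The two searches can be sketched over in-memory arrays that stand in for the frame numbers and timestamps read sequentially from the file. The names are hypothetical, and returning -1 when no frame matches is a choice of this sketch; the text does not say what the program displays in that case.

```java
/** Illustrative linear searches over values read in file order; names are hypothetical. */
final class FrameSearch {
    /** Index of the frame whose number equals target; 0 if target precedes the first frame; -1 if absent. */
    static int byFrameNumber(long[] frameNumbers, long target) {
        if (frameNumbers.length == 0) {
            return -1;
        }
        if (target < frameNumbers[0]) {
            return 0;                               // show the first image and inform the user
        }
        for (int i = 0; i < frameNumbers.length; i++) {
            if (frameNumbers[i] == target) {
                return i;
            }
        }
        return -1;                                  // end of file reached without a match
    }

    /** Index of the frame whose timestamp is closest to target, or -1 for an empty file. */
    static int byClosestTime(long[] timeStamps, long target) {
        int best = -1;
        for (int i = 0; i < timeStamps.length; i++) {
            if (best < 0 || Math.abs(timeStamps[i] - target) < Math.abs(timeStamps[best] - target)) {
                best = i;
            }
        }
        return best;
    }
}
```

Both methods visit every frame in order, which mirrors why search time grows with the distance of the target from the start of the file.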
Figure 25. Sequence of calls following a “Jump” command. The Playback Program will close,
then reopen the file and search for a matching frame.


Figure  26  shows  the  sequence  of  method  calls  when  the  exit  button  is  pressed.  Since 
there are no hardware inputs or outputs, the only requirements are to shut down the input file and
close  the  GUI. 
Figure 26. Sequence following the exit command.


3.4.3  Class  Diagrams 

Figure 27 shows the Playback Program class diagram. The package layout is identical to
the Processing Program and similar to the Acquisition Program. However, the DataHandlers
package is significantly smaller, as there is only one input class and no output classes. Furthermore,
since no algorithms are executed, there is no Model or Parameters class in the Model package. The
final important feature is the dependency loop between the Controller and the GUI. This is because
the GUI is not only the asynchronous input of the program, it is also the synchronous output.
Figure 27. Playback Program class diagram. Notice the significantly reduced DataHandlers and
Model packages, and the circular dependency between the Controller and GUI.


3.4.4  General  Discussion 

The two biggest issues are the circular dependency between the Controller and GUI, and
the extremely slow search when moving around in a file (approximately 40 fps). While it
would be difficult to remove the circular dependency without completely redesigning the program,
the slow search has been identified as an item for future work. Changing the design
of jumping to a particular time would be problematic because the time between frames is not constant.
Jumping to a frame number, however, could be changed to skip over the required number of bytes in the
file instead of reading through them (although keeping track of reference panel frames could be an
issue).

Another problem that has been identified is slow playback when multiple instances of
the program are running simultaneously. This occurs because the hard drives cannot keep up
with the demand. Assuming the hard drive switches between video files exactly once per frame,
which is a reasonable assumption given the application's waiting period between frames while
playing, the time to read a frame on the current system is 16.83 ms. This means that only five
players can be running at a time while maintaining the current frame rate of 10 fps. If time advance
searches are being performed, the hard drive attempts to switch between files more often, causing the
frame rate to decrease. A fix for this would be to upgrade to a significantly faster hard drive setup,
such as 10,000 rpm drives in RAID, or a solid state drive setup. The potential read time
could be reduced to approximately 4.2 ms per frame with a solid state drive on a SATA II
interface, and further still if used in striped RAID.
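
The player limit quoted above follows from dividing the frame period by the per-frame read time. The sketch below reproduces that arithmetic; the class and method names are hypothetical and are not part of the thesis software, and the model assumes, as the text does, exactly one file switch per frame and no other disk activity.

```java
// Hypothetical helper, not part of the thesis software.
final class PlaybackBudget {
    /**
     * Largest number of simultaneous players whose per-frame reads
     * fit inside one frame period (assumes one disk read per player per frame).
     */
    static int maxPlayers(double framesPerSecond, double readTimeMs) {
        double framePeriodMs = 1000.0 / framesPerSecond;
        return (int) Math.floor(framePeriodMs / readTimeMs);
    }
}
```

With the measured 16.83 ms read time and a 10 fps target, `PlaybackBudget.maxPlayers(10.0, 16.83)` gives 5, matching the figure above. Under the same simplified model, the estimated 4.2 ms solid state read time would allow 23 players.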

3.5  Summary 

The construction of the software is fairly straightforward, except for the multithreaded
aspects of the code. It provides a reasonably modular and expandable system while maintaining
a smaller program size. The Java coding also allows one to place the system on any computer
that can access the camera interfaces through Matlab, and does not require the code to be
recompiled or significantly modified.

IV. Testing and Analysis


4.1  Testing  and  Flaws 

The software was designed using a bottom-up approach. Testing is accomplished for all
possible program branches occurring at each level. While no single test plan has been developed
and implemented, the structure has been tested in data collections and any errors found have
been fixed or recorded.

4.1.1  Known  Errors  in  Software 

The following sections describe, in detail, known errors with the system as implemented.
These errors are limited to the Acquisition Program, and include periodic output file
corruption, errors when opening or closing outputs, and an improper refresh of the viewing
window.

4.1.1.1 File Corruption in the Acquisition Program

The most serious error is a sequence of seemingly random numbers that is sometimes
written to a video file by the acquisition program when a file is first created. This occurs in
approximately one out of every ten files. We have been unable to tie it to any particular
event and thus have been unable to find the cause of the error. The only correlation found is that
it tends to occur more frequently as the number of open windows is increased. In some cases,
the extra integers can be removed and the rest of the file copied into a new file. However, most
of the time, the files appear to be irrecoverable. The simplest way to check for a corrupted file is
to open it in the playback program. If, instead of playing a video, the screen says “waiting for
image”, then the file is most likely corrupted.




4.1.1.2 Concurrent Modification of Open Outputs in the Acquisition Program

If the Acquisition Program is running slowly, it is possible for a concurrent modification
exception to occur. This error occurs because the main loop keeps a separate list of open
windows, copied once per frame from the GUI’s master list. If the GUI closes a window while
the controller is in the middle of updating it, the program will hang. A delay of 500 ms to open
or close a window is added to prevent this from occurring. However, if the frame rate drops
below 2 fps, this problem can still occur.

The solution to the problem is to better synchronize the threads, not just provide a set
delay. We tried to use a set of semaphores for thread synchronization; however, they took too
long, and system performance decreased to the point that it was unable to meet performance
requirements.
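
One alternative to both the fixed delay and explicit semaphores is a copy-on-write list from the standard Java library, whose iterators work on a snapshot and therefore never throw a concurrent modification exception. The sketch below only illustrates that idea. The class `OpenWindows` and its methods are hypothetical, windows are represented by plain strings, and we have not measured whether this approach would meet the system's performance requirements.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the list of open output windows.
final class OpenWindows {
    private final List<String> windows = new CopyOnWriteArrayList<>();

    void open(String name)  { windows.add(name); }     // called from the GUI thread
    void close(String name) { windows.remove(name); }  // called from the GUI thread

    /**
     * Visits every window that was open when the loop started. The iterator
     * works on a snapshot, so a close() arriving mid-loop cannot break it.
     * duringUpdate stands in for a GUI event that arrives during the frame.
     */
    int updateAll(Runnable duringUpdate) {
        int updated = 0;
        for (String window : windows) {
            duringUpdate.run();
            updated++;
        }
        return updated;
    }

    int count() { return windows.size(); }
}
```

The trade-off is that every open or close copies the underlying array. That is cheap for a handful of windows, but a window closed mid-frame is still visited once, so the per-window update code must tolerate a window that has already been closed.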

4.1.1.3 Improper Refresh in Window Display of Acquisition Program

The window output of the acquisition program does not correctly refresh the blank space
at the bottom of the window, in the area around the label and checkbox, whenever other windows
are dragged over the top of it. We are unsure why Java is not handling this part of the GUI properly,
but it is not a performance hindering flaw, so we have not investigated the error any further. An
example of this issue is shown in Fig. 28. At the bottom of the window, inside the red box,
there is a streaked image left by another window being dragged across.

Figure 28. Window showing the improper refresh error.

4.1.2  Known  Shortcomings  in  Software 
4.1.2.1 Reference Panel Selection

Currently,  when  setting  the  location  of  a  white  or  dark  panel  in  the  acquisition  program, 
only  a  single  pixel  is  selected  for  each.  This  should  be  extended  to  include  the  selection  of  a 
rectangular  region  of  pixels.  The  program  should  be  modified  so  the  user  drags  a  rectangle 
across  the  panel,  and  the  program  uses  the  average  of  all  points  in  the  rectangle.  Using  the  mean 
will  result  in  a  cleaner  estimate  of  the  true  panel  properties  as  the  noise  is  averaged  out. 
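
The proposed change amounts to replacing a single pixel lookup with a mean over the dragged rectangle. A minimal sketch is given below, assuming the image is available as a row-major `int[][]` of raw pixel values. The class name, method name, and array layout are assumptions for illustration and do not describe the existing code.

```java
// Hypothetical helper, not part of the thesis software.
final class PanelRegion {
    /**
     * Mean of image[row][col] over rows y0..y1 and columns x0..x1, inclusive.
     * Averaging over the region suppresses per-pixel noise in the panel estimate.
     */
    static double mean(int[][] image, int x0, int y0, int x1, int y1) {
        long sum = 0;
        for (int y = y0; y <= y1; y++) {
            for (int x = x0; x <= x1; x++) {
                sum += image[y][x];
            }
        }
        int count = (x1 - x0 + 1) * (y1 - y0 + 1);
        return (double) sum / count;
    }
}
```

Selecting a one-pixel rectangle reproduces the current single-pixel behavior, so the change would be backward compatible with existing panel selections.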

Another  shortcoming  is  the  inability  to  change  or  set  new  panel  points  in  the  processing 
program.  The  parameters  used  to  calculate  the  ratios  can  be  changed,  but  not  the  locations 
themselves.  Therefore,  if  a  user  saves  the  raw  images  but  does  not  select  the  panels,  or  selects 
the  wrong  location,  there  is  no  way  to  correct  the  mistake  later. 

4.1.2.3 Difficulty Closing Out of the Processing and Playback Programs

Difficulty closing the Processing and Playback Programs is not so much an error as a
minor design flaw. In the processing and playback programs, whenever a file selection window
is open, there is no other GUI visible, and the selection window has no option to close. In the
acquisition program, windows can be closed through the main GUI, but in the other programs,
the user must continue entering all necessary filenames. A cancel button should be added
that cancels everything the program was doing and exits cleanly.

4.1.2.4 Slow Jump Times in Playback Program

Because the playback program must read sequentially through a file to find the matching
frame, searching for a point near the end of the file can be extremely time consuming, since even
moderate length video files are very large. There is basically no way to avoid this when jumping
to a time, because the time between one frame and the next is not consistent and therefore cannot
be pre-calculated. However, when jumping to a frame number, the current frame number is
known, as is the size of each image. The program should be changed to skip over the required
number of bytes in the file rather than reading them, tremendously increasing the speed of
jumping to a frame of interest.
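
A minimal sketch of the proposed byte-skipping jump is shown below. It assumes a fixed-size file header followed by fixed-size frame records. The actual video file layout is not described in this section, so `headerBytes`, `frameBytes`, and the class itself are hypothetical. Reference panel frames, if stored as extra records, would break the fixed stride and would need separate handling.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper; the real video file layout may differ.
final class FrameSeek {
    /** Byte position of a frame, assuming a fixed header and fixed-size frame records. */
    static long offsetOf(long frameIndex, long headerBytes, long frameBytes) {
        return headerBytes + frameIndex * frameBytes;
    }

    /** Moves the file pointer to the start of the requested frame without reading any data. */
    static void jumpTo(RandomAccessFile file, long frameIndex,
                       long headerBytes, long frameBytes) throws IOException {
        file.seek(offsetOf(frameIndex, headerBytes, frameBytes));
    }
}
```

Here `frameIndex` is the zero-based position of the frame in the file. If the stored frame numbers do not start at zero, the first frame number would have to be subtracted before calling `offsetOf`.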

4.1.2.5 Inefficient Parameter Modification

Currently, the only method of editing parameters is to open the associated text file, which
is a comma separated list of variable names and values. The user can edit the file, save it, then
click the “Update Parameters” button on the main GUI of the acquisition program. This file is
also used by the post processor, so any changes are reflected there as well.
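
For illustration, a comma separated parameter file of this kind could be read as sketched below. The exact file layout is not given in the text, so the sketch assumes one variable per line in the form `name,value[,value...]`. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader; the assumed layout is one "name,value[,value...]" entry per line.
final class ParameterFile {
    /** Parses the file text into a map from variable name to its numeric values. */
    static Map<String, double[]> parse(String text) {
        Map<String, double[]> params = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",");
            double[] values = new double[fields.length - 1];
            for (int i = 1; i < fields.length; i++) {
                values[i - 1] = Double.parseDouble(fields[i].trim());
            }
            params.put(fields[0].trim(), values);
        }
        return params;
    }
}
```

Storing an array per name lets single-argument parameters and multi-argument parameters share one format.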

It would be helpful in the future to have a dedicated GUI window to edit parameters,
where changes are instantly reflected in the outputs. This approach has the distinct advantage
of decreasing the direct knowledge of the system and algorithms that the user would need.
Furthermore, it would be faster to make parameter changes for the algorithms.

4.1.2.6 RGB Camera Issues


One major limitation to the system is the color camera. First, this camera is extremely
slow. It takes approximately 88 ms to acquire each frame, as discussed in Section 4.3.1. The
longer time compared with the NIR cameras is not unexpected, as the RGB camera acquires
eight times the amount of data as each of the NIR cameras (24 bits per pixel versus 12 bits per
pixel, and four times the spatial resolution). Lowering the resolution of the camera could speed
this  up  significantly,  but  this  function  only  exists  in  the  code  provided  by  the  manufacturer  and 
not  in  the  Matlab  camera  interface. 

The  other  problem  with  this  particular  camera  is  that  it  will  only  work  on  a  32-bit 
computer.  All  other  parts  of  this  system  will  function  on  the  newer  64-bit  computers.  This  is,  in 
part,  responsible  for  the  reduced  algorithm  performance  and  reduced  acquisition  frame  rates. 
System  performance  could  be  significantly  improved  if  this  camera  was  upgraded. 


4.2  System  Demonstration 

The system is demonstrated using the parameters specified below in Table 6 and Table 7.
Table 6 shows the F_i parameters, which each have six arguments, used by the Likelihood Ratio
Test, while Table 7 shows all the single argument variables used by all the algorithms.


Table  6.  Likelihood  Ratio  Test  parameters  used  in  demonstration  of  the  programs. 


Variable Name   i=1       i=2       i=3       i=4       i=5       i=6       i=7
P               0.0501    0.4286    0.5213    0.2966    0.2034    0.3733    0.1267
a               0.0331    0.0164    0.0076    0.0808    0.0868    0.1408    0.0059
b               0.0016    0.0002   -0.0006   -0.0111   -0.0145   -0.0026   -0.0006
d               0.0161    0.0153    0.0078    0.0202    0.0174    0.1981    0.0138
qs              0.7318    0.7008    0.5548    0.3446    0.0875    0.2332    0.8983
qg             -0.5921   -0.3185   -0.2306    0.4085   -0.1872   -0.0441    0.0063

Table 7. Basic parameters used in the demonstration of the programs.


Algorithm          Parameter   Default Value
Rules Based        b1          -1
                   b2          -0.15
                   c1          0.5
                   c2          1
Likelihood Ratio   eta         0.028
ELM                rw1580      0.987
                   rw1080      0.989
                   rw750       0.99
                   rwRed       0.99
                   rwGrn       0.99
                   rwBlu       0.99
                   rd1580      0.131
                   rd1080      0.108
                   rd750       0.09
                   rdRed       0.077
                   rdGrn       0.075
                   rdBlu       0.073

4.2.1  Acquisition  Program 

Figure 29 contains a screenshot of the main GUI for the program. The disabled rows of
check boxes rely on the 750 nm camera. Opening and closing windows and video files is
accomplished with the check boxes on the GUI. The left two columns are for viewing and
saving raw images from the cameras. The right two columns are for viewing and saving images
that have been processed through the ELM algorithm. The “RGB” row does not have a save
option because it requires three images at once, which is not possible using the current video file
format. However, the three colors can each be saved separately, allowing one to have access to
the full color data for post processing and for future algorithm exploration in other applications (e.g.,
Matlab).

Figure 29. Acquisition Program main GUI. The disabled rows rely on the 750 nm camera, which is not
currently present in the system.

One of the most important requirements of the program is to correctly
implement the algorithms. Figures 30, 31, and 32 show the outputs of the Basic Skin Detection,
Rules Based, and Likelihood Ratio algorithms, respectively. The associated color image is
shown in Fig. 33. The background of the scene contains a large amount of snow. Snow is a
known skin confuser and provides a reasonable demonstration of the detection portion of the
system. The goal of this set of images is to qualitatively show the algorithms are working as
designed. All three of the algorithms are being run on images that have been through the ELM
algorithm. Both the grey and white panels used for calibration are visible on either side of the
subject of Fig. 33.

The result of the basic skin detection algorithm on the scene is shown in Fig. 30, where
one clearly sees that the Basic Skin Detection algorithm is working. All exposed skin is clearly
between the thresholds defined as c1 and c2 in Table 7. There are some highlighted shadows and
edges visible, but most important is the large amount of snow behind the subject that is declared
as skin. However, this is expected from this algorithm, showing that it is functioning correctly.

Figure 31 shows the result from the rules based algorithm and the parameters shown in
Table 7. The most prominent difference between the basic skin detection and the rules based
detection is that the latter does not declare the snow as skin, resulting in the black background.
Also notable is the reduced highlighting in shadows and edges, although the tripods and feet still
have some minor edge effects. Another point is that shadowed skin is not fully highlighted.

Figure 32 shows the output of the Likelihood Ratio algorithm. The algorithm is clearly working, as the
skin is highlighted more than the surrounding environment. We were unable to achieve better
results by simply adjusting the value of p. While there is reduced highlighting of shadows and
edges over the basic detector, there are still many false detections from the snow. Furthermore,
the algorithm does not detect skin as well as the rules based detector, particularly on the face.

Figure 30. Output of the basic skin detection (NDSI only) algorithm and ELM. Notice the
large number of snow pixels that are declared as skin.

Figure 31. Output of the rules based (NDGRI and NDSI) detection algorithm and ELM.
Notice that the snow is not declared as skin in this image.

Figure 32. Output of the likelihood ratio test algorithm and ELM. Notice that snow is still
declared as skin, shadowed regions of skin are not identified, and the corners of the image are
also declared as skin.

Figure 33. Color image of the test scene. The panels used for ELM are visible beside the
subject.

Figures 34 through 37 show the operation of the ELM algorithm. Figure 34 shows the raw color
image from the RGB camera. Looking closely reveals a light blue dot on the white panel, and a dark
blue dot on the dark panel (these are also visible in Fig. 33). The dots represent the coordinates from
which the ELM ratio computations in Eqn. 12 and Eqn. 13 are computed. Figure 35 shows the ELM
corrected color image. Also notice that the dots do not appear in the corrected images (the points also
cannot be set in the corrected image windows).

Another feature of importance is power thresholding based on the 1080 nm camera. Whenever a
pixel is below a threshold in the 1080 nm image, the same pixel in all the other images is set to black,
to help avoid false detections due to noise. The 1080 nm camera was chosen because skin should be
lighter than the surroundings at this wavelength, so dark regions should not be skin, and thus can be
safely eliminated from consideration. This is responsible for the black regions in the upper corners of
Fig. 35.
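
The power thresholding step can be sketched as a simple mask operation. In the sketch below the images are assumed to be same-sized `double[][]` arrays and the threshold value is left to the caller. The class name, the array representation, and the use of 0.0 for black are assumptions, not details taken from the implementation.

```java
// Hypothetical sketch of power thresholding; not the thesis implementation.
final class PowerThreshold {
    /**
     * Sets each target pixel to 0.0 (black) wherever the corresponding
     * pixel of the 1080 nm image is below the threshold.
     */
    static void apply(double[][] nir1080, double[][] target, double threshold) {
        for (int y = 0; y < target.length; y++) {
            for (int x = 0; x < target[y].length; x++) {
                if (nir1080[y][x] < threshold) {
                    target[y][x] = 0.0;
                }
            }
        }
    }
}
```

The same call would be repeated for every other image in the model, so that a dark 1080 nm pixel is blacked out consistently across all bands.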

Figures  36  and  37  show  images  from  the  same  scene  as  in  Fig.  34  but  the  images  are  from  the 
1080  nm  camera.  Notice  that  while  the  ELM  correction  is  not  extremely  visible  on  the  color  images, 
where  it  is  just  a  slight  color  shift,  the  difference  between  corrected  and  uncorrected  images  from  the 
1080  nm  NIR  camera  is  significant. 

The final demonstration of the Acquisition Program is the ability to save video streams. The
truest demonstration of this is to be able to use them correctly in the Processing and Playback Programs,
as is shown in the following sections. Figures 38 and 39 show the dialogs for video file creation. The first
box, Fig. 38, appears when the associated checkbox is checked, but the file only begins to save after the
“Save” button is clicked. Once the save is started, the second box, shown in Fig. 39, remains visible until
the save is ended by un-checking the box in the GUI.

Figure 34. Raw color image.

Figure 35. ELM corrected color image.

Figure 36. Raw 1080 nm image.

Figure 37. ELM corrected 1080 nm image.

Figure 38. File selection window.

Figure 39. Window visible during file creation.

4.2.2 Processing Program

The correctness of the Processing Program is slightly more difficult to demonstrate than
that of the acquisition program, as there is no visible output. All outputs must be checked
in the playback program after they are generated, and compared against results saved from the
acquisition program.

Figures  40,  41,  and  42  are  screenshots  showing  the  program  flow.  The  input  file  window 
shown  in  Fig.  42  may  be  shown  multiple  times  depending  on  the  algorithm.  The  required 
frequency  or  image  type  for  each  input  is  shown  in  the  title  bar  of  the  window. 


Figure 40. File selection window for output file.

Figure 41. Algorithm selection window.

Figure 42. File selection window for input file. The file type is specified in the title bar.

4.2.3 Playback Program

While this is the simplest of the programs, it is still vitally important. It is also currently
the only method of confirming the correctness of the video creation functionality of the
acquisition program, and the correctness of the post processor.

To help show the functionality of this program, and of the post processor, we have
chosen to show both the direct NDSI and NDGRI outputs in the player window. These files can
only be generated by the post processor, because the acquisition program can only save and
display either direct data or the output of a higher level algorithm.

Figure 43 shows the first window presented to the user, allowing them to select the video
file. Figures 44 and 45 show the player window itself. The scene in Fig. 44 and Fig. 45 is the
same as that used in Figs. 30-37.

Figure  44  shows  the  NDSI  results  from  1580  nm  and  1080  nm  video  files.  Both  inputs 
were  previously  run  through  ELM.  Both  skin  and  the  snow  are  clearly  visible  as  brighter  white 
than  all  of  the  surroundings,  as  expected. 

Figure  45  shows  the  NDGRI  results  from  660  nm  and  540  nm  videos,  run  through  ELM 
first.  Skin  appears  darker  than  the  surroundings,  representing  negative  values.  The  snow  in  the 
scene  is  still  bright.  One  of  the  issues  with  NDGRI  is  immediately  visible  from  the  figure. 
NDGRI  tends  to  have  a  much  lower  contrast  than  NDSI,  making  it  harder  to  discriminate 
features  than  with  the  NDSI. 


Figure 43. File selection window of Playback Program.

Figure 44. Main GUI of Playback Program showing an image of NDSI values. Both skin and
the snow in the background have high values.

Figure 45. Main GUI of Playback Program showing an image of NDGRI values. Skin has a
lower value than most of the surrounding environment, while the snow has a higher value.

4.3 Performance Analysis
4.3.1 Camera Performance

In order to analyze the performance of the cameras, we recorded the time to acquire a
frame from each camera during standard runtime, with no algorithms being run. We averaged the
times over 1000 acquisitions, as shown in Table 8. The reason for the difference between the acquisition
times of the 1080 nm and 1580 nm cameras is not known for certain. One possibility is that, because all
three cameras are triggered immediately before the 1580 nm camera's image is read, the program may have
to wait some additional time for that image to be ready, which is not the case for the 1080 nm
camera. It is, however, clear that the color camera is significantly slower than the NIR cameras.


Table  8.  Acquisition  time  per  image  for  each  camera. 


Camera     Acquisition Time (ms)
1580 nm    27.8
1080 nm    18.0
RGB        87.8


4.3.2  Algorithm  Performance 

In order to compare the performance of each algorithm, we have compared the number of
mathematical operations in each. The mathematical analyses are based on the “as written” math,
and do not take into account any optimizations or additional operations added at compile time or
runtime.

As  a  secondary  comparison  of  efficiency,  we  analyze  the  length  of  time  it  takes  to 
perform  each  algorithm,  averaged  over  1000  runs.  While  these  times  are  not  universal  and  will 
change  depending  on  the  machine  the  program  is  run  on,  and  the  amount  of  load  on  the  system  at 
that  particular  time,  they  do  provide  a  relative  comparison  of  one  algorithm  to  another. 
As the program is written, there are also steps outside of the algorithm itself, such as
copying images in memory, calculating new times and frame numbers, etc. The ELM in
particular must perform a deep copy of all the raw images, which is very time consuming but
does not appear in the mathematical analysis. The mathematical operations only take into
account the operations that are done on each and every pixel.

The  times  shown  in  Column  2  of  Table  9  are  the  times  required  only  to  run  that  particular 
algorithm,  and  not  the  sub-algorithms  prior  to  it.  The  estimated  total  times  shown  in  Column  3 
take  into  account  all  prerequisite  algorithms,  including  the  optional  ELM.  Neither  of  these  times 
takes  into  account  the  acquisition  time  of  the  cameras,  or  the  output  times,  or  any  other  program 
function.  They  are  purely  the  times  required  to  run  that  particular  algorithm  on  one  image. 


Table  9.  Mathematical  Complexity  of  the  algorithms,  and  the  time  to  complete  each  algorithm, 
as  run  on  one  particular  machine.  The  *  indicates  that  the  first  value  is  as  implemented,  and  the 
second  value  is  if  the  750  nm  camera  is  added  to  the  system. 


Algorithm   Time (ms)  Total Time (ms)  Addition  Multiply  Divide  Compare  Power  Prerequisites
NDSI        6.44       39.27            2         -         1       -        -      (ELM)
NDGRI       8.60       41.43            2         -         1       -        -      (ELM)
NDVI        N/A        N/A              2         -         1       -        -      (ELM)
NIMI        N/A        N/A              5         8         1       -        4      (ELM)
BasicSD     7.09       46.36            -         -         -       2        -      NDSI
RulesBased  4.24       52.11            -         -         -       4        -      NDSI, NDGRI/NDVI
LikeRatio   53.39      101.26           68        84        1       1        1      NDSI, NDGRI
ELM         32.83      32.83            5/6*      5/6*      -       10/12*   -      -
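
The total times in Table 9 can be reproduced by summing each algorithm's own time with the times of its prerequisites and, optionally, ELM. The sketch below does only that arithmetic. The class and constant names are hypothetical, while the values are the measured per-algorithm times from the table.

```java
// Hypothetical helper that re-derives the "Total Time" column of Table 9.
final class RunTimes {
    // Per-algorithm times in milliseconds, copied from Table 9.
    static final double ELM = 32.83;
    static final double NDSI = 6.44;
    static final double NDGRI = 8.60;
    static final double BASIC_SD = 7.09;
    static final double RULES_BASED = 4.24;
    static final double LIKE_RATIO = 53.39;

    /** Sum of the given stage times, plus ELM when withElm is true. */
    static double total(boolean withElm, double... stages) {
        double sum = withElm ? ELM : 0.0;
        for (double stage : stages) {
            sum += stage;
        }
        return sum;
    }
}
```

For example, `RunTimes.total(true, RunTimes.NDSI, RunTimes.NDGRI, RunTimes.RULES_BASED)` gives the 52.11 ms listed for the Rules Based detector, and `RunTimes.total(false, RunTimes.NDSI, RunTimes.BASIC_SD)` gives 13.53 ms for Basic Skin Detection without ELM.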

4.3.2.1 NDSI

The  NDSI  algorithm  takes  2  adds  and  1  division  per  pixel.  The  total  time  to  perform  the 
algorithm  is  6.44  ms. 
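
As a concrete instance of the operation count, a normalized difference index uses one subtraction, one addition, and one division per pixel. The sketch below assumes the index is formed as the 1080 nm reflectance minus the 1580 nm reflectance, divided by their sum. The band order and the class name are assumptions made for illustration and should be checked against the definition given earlier in the thesis.

```java
// Hypothetical per-pixel sketch; band order is an assumption.
final class Ndsi {
    /** Normalized difference of two reflectances: 2 additions/subtractions and 1 division. */
    static double index(double r1080, double r1580) {
        return (r1080 - r1580) / (r1080 + r1580);
    }
}
```

With floating-point inputs, a pixel where both reflectances are zero yields NaN rather than an exception, so a real implementation would need to decide how such pixels are displayed.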
4.3.2.2 NDGRI

The NDGRI algorithm takes 2 adds and 1 division per pixel. The total time to perform
the algorithm is 8.60 ms. This time is somewhat odd because the algorithm is completely
identical to NDSI except for the images from the model that are used. However, we ran the
experiment several times with similar results, so it is not a fluke, but we have no explanation for
the extra 2 ms.

4.3.2.3 NDVI

Because there is no 750 nm camera, we cannot run the NDVI algorithm to determine the
time it takes. However, it is virtually identical to the NDSI and NDGRI mathematically, so it
would take about the same length of time. It also takes 2 adds and 1 division per pixel.

4.3.2.4 NIMI

As with the NDVI, the NIMI algorithm cannot be run until the 750 nm camera is
connected to the system. Mathematically, this algorithm requires 8 multiplications, 1 division, 5
additions, and 4 calls to Math.pow(), which raises a number to a power. Without visibility into
this function, we do not know the runtime of this operation.

4.3.2.5 Basic Skin Detection

Basic skin detection requires that the NDSI algorithm also be run, and then takes an
additional 2 comparisons per pixel. The additional time required to run is 7.09 ms. This makes
the total start to finish time for the algorithm approximately 13.53 ms.

4.3.2.6 Rules Based Algorithm

There are two virtually identical algorithms, the Rules Based Detector with NDGRI, and
the Rules Based Detector with NDVI. Because the 750 nm camera is not operational, the NDVI
version is disabled and is not analyzed here.

The rules based algorithm requires that NDSI and either NDGRI or NDVI both be run,
and then takes an additional 4 comparisons per pixel. The additional time required to run is 4.24
ms, as shown in Table 9. This makes the total start to finish time for the algorithm
approximately 19.28 ms.

Another point that is unclear is why this algorithm is almost 3 ms faster than the Basic
Skin Detection algorithm. The two algorithms are nearly identical except that the Rules Based
algorithm must make more comparisons, using two images as input instead of one. If anything it
should be slower, but multiple tests still show it as faster.
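
To make the comparison counts concrete, the sketch below treats the rules based detector as two range checks, one on NDSI and one on NDGRI, for four comparisons per pixel. The basic detector would correspond to the NDSI check alone. The exact inequalities (strict versus inclusive) and the pairing of b1 and b2 with NDGRI are assumptions, and the class name is hypothetical.

```java
// Hypothetical per-pixel sketch; the inequalities are assumed, not taken from the thesis code.
final class RulesBasedDetector {
    /** Declares skin when NDSI lies in (c1, c2) and NDGRI lies in (b1, b2). */
    static boolean isSkin(double ndsi, double ndgri,
                          double c1, double c2, double b1, double b2) {
        return ndsi > c1 && ndsi < c2      // 2 comparisons on NDSI
            && ndgri > b1 && ndgri < b2;   // 2 comparisons on NDGRI
    }
}
```

With the Table 7 defaults (c1 = 0.5, c2 = 1, b1 = -1, b2 = -0.15), a pixel with a high NDSI but a positive NDGRI, as described for snow in the NDGRI output, is rejected, while a pixel with a high NDSI and a clearly negative NDGRI is accepted.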

4.3.2.7 Likelihood Ratio Test

The Likelihood Ratio test algorithm requires that NDSI and NDGRI both be run, and
then takes an additional 70 additions, 84 multiplications, 3 divisions, 1 exponential, and 1 comparison
per pixel. The additional time required to run is 53.39 ms, as shown in Table 9. This makes the
total start to finish time for the algorithm approximately 68.43 ms. If it is combined with ELM,
the runtime is over 100 ms, resulting in a dramatic degradation of system performance.

4.3.2.8 Empirical Line Method

As has been discussed elsewhere in the thesis, the ELM algorithm is somewhat different
from the other algorithms. When it is run, it generates an entire second model, not just a new
image. This algorithm requires 1 addition, 1 multiplication, and 2 comparisons for each
wavelength. The current system has five input frequencies, so the total cost is 5 additions, 5
multiplications, and 10 comparisons. If the 750 nm camera is added, the cost will be 6 additions,
6 multiplications, and 12 comparisons. The total computational cost is 32.83 ms, with an
estimated 39.4 ms with the 750 nm camera. This algorithm does not depend on any other
algorithms, but will be run if the user selects to run any other algorithms on its output.
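
The per-wavelength cost of 1 addition, 1 multiplication, and 2 comparisons is consistent with a linear correction followed by clamping. The sketch below uses the standard two-point empirical line fit through the white and dark panel measurements. It is not necessarily identical to Eqn. 12 and Eqn. 13, and the clamp to the range 0 to 1 is our assumption about what the two comparisons do. All names are hypothetical.

```java
// Hypothetical two-point empirical line sketch; may differ from Eqn. 12 and Eqn. 13.
final class EmpiricalLine {
    final double gain;
    final double offset;

    /** Fits the line through (darkValue, darkRefl) and (whiteValue, whiteRefl). */
    EmpiricalLine(double whiteValue, double darkValue, double whiteRefl, double darkRefl) {
        gain = (whiteRefl - darkRefl) / (whiteValue - darkValue);
        offset = whiteRefl - gain * whiteValue;
    }

    /** Per pixel: 1 multiplication, 1 addition, and 2 comparisons (assumed clamp to [0, 1]). */
    double correct(double raw) {
        double reflectance = gain * raw + offset;
        if (reflectance < 0.0) reflectance = 0.0;
        if (reflectance > 1.0) reflectance = 1.0;
        return reflectance;
    }
}
```

The gain and offset are computed once per frame and per wavelength from the panel pixels, so only `correct` contributes to the per-pixel counts in Table 9. The known panel reflectances would come from the rw and rd parameters in Table 7.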
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4.2.3  Algorithm  Scalability 

All  of  the  algorithms  implemented  run  on  0(n)  complexity,  where  n  is  the  number  of 
pixels.  Therefore,  doubling  each  dimension  of  the  camera  resolution  will  cause  a  four-fold 
increase  is  runtime.  Figure  46  shows  the  estimated  runtime  for  various  common  image 
dimensions,  with  the  Rules  Based  Detector  and  the  Likelihood  Ratio  detector,  both  using  ELM. 
These  are  estimated  times,  not  results  from  actual  experiments. 

If we assume that the acquisition and output times are also linear, meaning that the operation of
the entire system scales in proportion to the number of pixels in the images, we can get a
rough estimate of the effect larger images have on system performance. For example, the current
system runs at approximately 4 fps with the rules-based algorithm, and 2 fps with the likelihood
ratio test algorithm. If the camera resolution is upgraded to 2048x2048 without upgrading the
computer hardware being used, the frame rates will drop to approximately 0.29 fps and 0.14 fps,
respectively, which is far below the desired performance.
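The linear-scaling estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch only; the class and method names are hypothetical and not part of the thesis software, and the example resolutions in `main` are illustrative rather than the system's actual camera resolution.

```java
/** Rough frame-rate estimate under the linear-scaling assumption (illustrative names). */
final class FrameRateScaling {

    /**
     * Scales a measured frame rate to a new resolution, assuming the total
     * per-frame time is proportional to the number of pixels.
     */
    static double estimateFps(double baseFps, int baseWidth, int baseHeight,
                              int newWidth, int newHeight) {
        double basePixels = (double) baseWidth * baseHeight;
        double newPixels = (double) newWidth * newHeight;
        return baseFps * basePixels / newPixels;
    }

    public static void main(String[] args) {
        // Doubling each dimension quadruples the pixel count, so the rate drops by 4x.
        // The 1024x1024 base is an example value, not the thesis camera resolution.
        double fps = estimateFps(4.0, 1024, 1024, 2048, 2048);
        System.out.println("Estimated frame rate: " + fps + " fps");
    }
}
```

Substituting the actual base resolution and the measured 4 fps and 2 fps rates gives the 2048x2048 estimates quoted in the text, subject to the same linearity assumption.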


Figure 46, Estimated algorithm run times for various common camera dimensions (horizontal
axis: camera resolution). Both algorithms are running ELM on all data first.

V. Discussion


5.1  Design  and  Methodology 

The system is designed with three distinct programs. The first is a versatile Acquisition
Program that can acquire images and view, save, and run algorithms on them in real time. The
next program, the Processing Program, can run algorithms and generate new videos based on
saved video streams from the Acquisition Program. The final program is a Playback utility for
the video files.

All programs were developed using a bottom-up approach and UML modeling
practices. Basic use cases were developed for each program, then sequence diagrams, class
diagrams, and finally the programs themselves. The bottom-up approach was used because it was
necessary to determine exactly what was required to access the cameras and the shape of the data
they returned, and then build the rest of the system around that information. All three programs are
modeled after the Model-View-Controller pattern. However, the programs are all multithreaded,
so the pattern is not strictly adhered to.

The Acquisition Program was built first, with the other two programs adhering to the
standards and limitations it created. Testing was performed through usage: the program was
used in data collections, and whenever a bug was found, it was dealt with. There are still a few
known minor errors, listed in Section 4.1.

5.2  Results  and  Performance 

The algorithm outputs are functioning as expected, as shown in the figures in Section
4.2. Algorithm run times and complexity are shown in Section 4.3, and are reasonably close to
expectations. The Empirical Line Method (ELM) and the Likelihood Ratio Test are computationally
intensive and both take a long time, whereas all the feature calculations and Rules Based
algorithms are much faster.

On the current machine, the frame rate is approximately 10 frames per second (fps)
under light load. Under a reasonable load, where only a few simpler algorithms are run, the frame
rate drops to about 5 fps. If all algorithms, multiple windows, and video file saves are in
operation, the system can run as slow as 0.5 fps.

5.3  Future  Work 

Needed  changes  and  fixes  were  noted  in  Section  4.1.  However,  there  are  several  major 
expansions  that  we  have  recognized  the  need  for. 

5.3.1 Exporting Images to Matlab

A  utility  should  be  developed  to  export  individual  images  in  a  format  that  can  be  read  in 
Matlab.  Having  the  image  available  in  Matlab  would  allow  for  numerical  analysis  and 
manipulation  of  the  actual  values  of  the  image,  not  just  as  a  visible  image. 
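One possible shape for such a utility is sketched below: it writes a single frame as comma-separated text, which Matlab can load with `csvread`. This is an assumption-laden illustration, not the thesis implementation. The class name, the `int[][]` frame representation, and the choice of CSV are all hypothetical, and the sketch uses Java features newer than the JDK 6 listed in Appendix A.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Writes one image as comma-separated pixel values (hypothetical helper). */
final class ImageCsvExport {

    /** Each image row becomes one text line; columns are separated by commas. */
    static void writeCsv(int[][] pixels, Path out) throws IOException {
        try (PrintWriter writer = new PrintWriter(
                Files.newBufferedWriter(out, StandardCharsets.US_ASCII))) {
            for (int[] row : pixels) {
                StringBuilder line = new StringBuilder();
                for (int col = 0; col < row.length; col++) {
                    if (col > 0) {
                        line.append(',');
                    }
                    line.append(row[col]);
                }
                writer.println(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[][] frame = {{1, 2, 3}, {40, 50, 60}};   // stand-in for real pixel data
        Path out = Files.createTempFile("frame", ".csv");
        writeCsv(frame, out);
        System.out.println("Wrote " + out);
    }
}
```

Text output keeps the actual pixel values intact and is easy to inspect, at the cost of file size; a binary format would be more compact for full-resolution frames.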

5.3.2  Addition  of  the  750  nm  camera 

Most  of  the  required  coding  and  algorithms  are  already  present  in  the  program  but 
disabled.  Adding  support  for  the  750  nm  camera  would  allow  for  the  use  of  NDVI,  NIMI,  and 
the  rules  based  detector  with  NDVI. 

5.3.3  Target  Different  Language  Environment 

Java’s advantage lies in its flexibility. Programs can be run on multiple different
operating systems and hardware platforms. However, the advantage is lost with our camera
system because some of the drivers are Windows-specific. Therefore, if the system as designed
were developed in C++ or another similar language, it could run faster, potentially up to 30 fps.

5.3.4 Implement Imperx Laptop Cards

The current National Instruments Camera Link cards use a PCI interface, and are therefore
limited to desktop machines. The Imperx Camera Link cards use an ExpressCard/54 interface and
can run in a laptop. If a slightly different version of the Matlab code is used, it would be possible to
access these cards. Opening a selection window when the Acquisition Program is run, or
auto-detecting which cards are present, would be a substantial flexibility boost to the system.

5.3.5 Use of Different RGB Camera

As discussed in Section 4.1.2.6, the RGB camera that was used has some issues. Most
prominent are the slow frame rate and the lack of 64-bit compatibility. We recommend finding a
better camera for this application.

5.3.6  Additional  Threads  and  Optimization  of  Synchronous  Program  Elements 

There are several ways to optimize the efficiency of the Acquisition Program. Examples
include using multiple threads to pipeline the acquisition, processing, and output of images, or
running NDGRI only on pixels identified by the basic NDSI-based detector. Such a pipeline
might look like that in Fig. 47, where each box represents a different thread.


Figure  47,  Example  of  a  possible  pipeline  implementation  to  increase  speed. 
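A pipeline of this kind can be sketched with one thread per stage and bounded queues between stages. The example below is a minimal illustration under stated assumptions, not the thesis design: the class name is hypothetical, frames are stand-in integer arrays, and the "algorithm" is a placeholder sum. Bounded queues make a fast stage wait for a slow one instead of buffering without limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Three-stage acquire -> process -> output pipeline (illustrative sketch). */
final class FramePipeline {
    // Sentinel frame, compared by reference, that tells a stage to shut down.
    private static final int[] POISON = new int[0];

    /** Pushes frameCount stand-in frames through the pipeline and returns the outputs. */
    static List<Integer> run(int frameCount) throws InterruptedException {
        BlockingQueue<int[]> raw = new ArrayBlockingQueue<>(4);
        BlockingQueue<int[]> processed = new ArrayBlockingQueue<>(4);
        List<Integer> sink = new ArrayList<>();

        Thread acquire = new Thread(() -> {
            try {
                for (int i = 0; i < frameCount; i++) {
                    raw.put(new int[] {i, i + 1});           // stand-in for a camera grab
                }
                raw.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread process = new Thread(() -> {
            try {
                for (int[] f = raw.take(); f != POISON; f = raw.take()) {
                    processed.put(new int[] {f[0] + f[1]});  // stand-in for an algorithm
                }
                processed.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread output = new Thread(() -> {
            try {
                for (int[] f = processed.take(); f != POISON; f = processed.take()) {
                    sink.add(f[0]);                          // stand-in for display or file save
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        acquire.start();
        process.start();
        output.start();
        acquire.join();
        process.join();
        output.join();   // join() makes the output thread's writes to sink visible here
        return sink;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(3));
    }
}
```

With this structure the camera grab for frame k+1 can overlap the processing of frame k, so the sustained frame rate is bounded by the slowest stage rather than by the sum of all stages.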


5.4  Conclusion 

The architecture designed here allows a common user to quickly and easily acquire data,
view raw and processed data, and save detections in real time. For the search and rescue
community, this provides a unique mechanism to run skin detection algorithms in real time.
It also allows users to easily save, post-process, and view video files. The program correctly
implements all common algorithms required for skin detection, for use both in real time and in
post-processing. Furthermore, it provides a platform suitable for future expansion in support of a
critical Air Force research area: human measurement and signature intelligence.

Appendix A. Software

A.1 Software Installation and Setup

A.1.1 Required Software


Windows XP 32-bit
Matlab® 7.9.0.529 (R2009b) 32-bit
Matlab® Compiler™ 4.11
Matlab® Builder™ JA 2.0.4
Matlab Component Runtime 7.11
Matlab Image Processing Toolkit
Matlab Image Acquisition Toolkit
Eclipse IDE for Java Developers (win32 version)
Java JDK6 (1.6.0_17)
Java SE Runtime Environment (1.6.0_17)
SUI Image Analysis Software and .ICD files
NI-IMAQ
Thorlabs Software for DCx USB Cameras (CD3.32)

A.1.2 Software Setup:

• Standard installation for Matlab, Eclipse, and MS Visual Studio.

•  Install  the  Matlab  Component  Runtime  (MCR). 

The MCR is installed by running

C:\Program Files\MATLAB\R2009b\toolbox\compiler\deploy\win32\MCRInstaller.exe

and following the default installation instructions.

• Set up Windows and Matlab environment variables

Open the environment variables box in Windows and add the following to the PATH
variable:

C:\Program Files\Java\jdk1.6.0_17;C:\Program Files\Java\jdk1.6.0_17\bin

Edit (or add) JAVA_HOME to:

C:\Program Files\Java\jdk1.6.0_17

Open Matlab and enter the following at the command prompt:

setenv('JAVA_HOME', 'C:\Program Files\Java\jdk1.6.0_17')

• Set the C++ Compiler that the Matlab Compiler will use

In either the Matlab or DOS command prompt, type:

mbuild -setup

Allow mbuild to locate installed compilers, and select lcc when presented with options.
Finally, verify the choices and the program will exit.

• Copy .ICD files for the NIR cameras to the location required by NI-IMAQ.

This location can be found by doing a Windows search for *.icd, which will turn up one
directory full of these files.

Make a second copy of the camera's .ICD file, and use one for each camera so that settings can be
saved independently.

A.2 Explanation of Software Operation

Functions are programmed in Matlab, with one function per m-file. The Matlab
Builder JA allows a user to create any number of classes, and place any number of function
m-files into each. It then wraps all the classes and functions into a JAR file.

In order to create the JAR file, the Matlab Builder JA makes use of the Matlab Compiler
(mcc.exe) and the Java compiler (javac.exe). The Matlab Compiler makes use of a C++
compiler, which is the reason for setting its location with mbuild.exe.

When setting up the Java project in Eclipse, both the compiler-generated JAR and another
file, JavaBuilder.JAR, must be imported. JavaBuilder.JAR contains the necessary classes for the
type conversion between standard Java classes and the types required for the Matlab function
calls. The user-generated JAR actually contains the compiled functions.

At runtime, the JAR will link the Java classes that are Matlab function calls to an
associated .dll contained in the same directory as the JAR. This .dll will link to another .dll in
the Matlab Component Runtime to actually run the Matlab functions.

A.3 Instruction for Eclipse Project Setup

To set up the Java project in Eclipse, add the JAR file that was generated by the Matlab
Builder JA. Also add javabuilder.jar, found in C:\Program Files (x86)\MATLAB Compiler
Runtime\v711\toolbox\. Both should be added as external JARs. Once they are added to the
project, add this line to the top of any Java files that use Matlab: "import
com.mathworks.toolbox.javabuilder.*;". Finally, import the package of the compiled functions
as "import {package name}.*;".

Appendix B. Users Manual

B.1 Acquisition Program

•  Camera  Configuration 

Depending on the light conditions, the operational settings of the NIR cameras have to be
set up. Open the NI-IMAQ software, and select the first channel camera. Click the “Grab” button,
then change the Operational Setting and the Digital Gain parameters until the image is
reasonable. Click the “Save” button. Repeat for the second channel, then close the program.

•  Startup 

In Eclipse, open AcqMain.java in the GUI package of the AcquisitionProgram project,
and select run. System messages are displayed at the bottom of the screen. The GUI will appear
when everything is running.

•  Closing 

To close the program, click the “Exit” button on the bottom of the main GUI. This will
close all open outputs in turn, and then close the program. The “X” in the upper right corner will
not close the program.

•  Viewing  Windows 

The first and third columns of checkboxes on the main GUI are for the window viewers.
The first column is to view the raw data, and the third to view the ELM processed data.
Selecting the checkbox opens the window, and unselecting the checkbox closes it. The windows
cannot be closed using the “X” in the upper right corner.

Crosshairs  can  be  overlaid  onto  the  image  by  selecting  the  checkbox  under  the  image. 
These  are  not  stored  in  the  image  in  any  way,  only  displayed  in  the  viewer.  They  can  be  used  for 
lining up the cameras onto a target.

• Video File Creation


The  second  and  fourth  columns  of  checkboxes  on  the  main  GUI  are  for  the  video  file 
creation  windows.  The  second  column  is  for  the  raw  data,  and  the  fourth  to  save  the  ELM 
processed  data.  Selecting  the  checkbox  opens  the  window,  and  unselecting  the  checkbox  closes 
it.  The  windows  cannot  be  closed  using  the  “X”  in  the  upper  right  corner. 

The  first  window  to  appear  is  a  filename  selection  window.  Entering  a  pre-existing 
filename  will  cause  a  message  to  be  displayed.  Entering  a  valid  filename  will  cause  this  window 
to  change  to  a  smaller  window  simply  listing  the  filename,  and  will  start  the  video  file  creation. 
The  file  does  not  begin  acquiring  until  after  a  filename  is  selected.  Closing  the  window  through 
the  main  GUI  ends  the  video  file  creation. 

All video files are saved to “C:/VideoFiles/”, and are automatically assigned a “.vid”
extension.

•  Setting  Panel  Coordinates 

When  the  program  starts  there  are  no  panel  coordinates  set,  and  the  ELM  will  do  nothing. 
Panel  coordinates  can  be  set  in  any  open  raw  image  window  (first  column  of  checkboxes).  They 
cannot  be  set  in  the  ELM  processed  windows. 

The  points  are  set  by  clicking  on  the  panels  in  the  image.  A  left  click  will  set  the  grey 
panel,  and  a  right  click  will  set  the  white  panel.  The  raw  image  displays  will  all  show  a  light 
blue  dot  at  the  coordinates  of  the  white  panel,  and  a  dark  blue  dot  over  the  grey  panel.  As  with 
the  crosshairs,  these  do  not  overwrite  any  data  in  the  image,  they  are  only  overlaid  in  the  display. 
The ELM processed images do not show these dots. The panels may be selected in either
order.

Clearing the panel coordinates is done by clicking the “Clear Points” button in the main
GUI.

Whatever  panel  coordinates  are  set  when  a  file  save  is  started  are  saved  at  the  beginning 
of  the  file.  Any  subsequent  panel  coordinate  changes  are  reflected  when  they  occur. 
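The role of the two panel clicks can be illustrated with a two-point empirical line fit: the raw values at the grey and white panels, together with the panels' known reflectances, define a gain and offset that are then applied to every pixel. The sketch below is an assumption, not the thesis code. The class name and the panel reflectance values are hypothetical, and treating the two comparisons per pixel as a clamp to the range [0, 1] is a guess consistent with the operation count given in Chapter 4.

```java
/** Two-point Empirical Line Method fit for a single wavelength (illustrative sketch). */
final class EmpiricalLine {
    private final double gain;
    private final double offset;

    /**
     * Fits reflectance = gain * raw + offset through the grey and white panel samples.
     * reflGrey and reflWhite are the panels' known reflectances (assumed inputs).
     */
    EmpiricalLine(double rawGrey, double rawWhite, double reflGrey, double reflWhite) {
        if (rawWhite == rawGrey) {
            throw new IllegalArgumentException("Panel samples must differ");
        }
        gain = (reflWhite - reflGrey) / (rawWhite - rawGrey);
        offset = reflGrey - gain * rawGrey;
    }

    /** One multiplication, one addition, and two comparisons per pixel. */
    double apply(double raw) {
        double reflectance = gain * raw + offset;
        if (reflectance < 0.0) {
            return 0.0;
        }
        if (reflectance > 1.0) {
            return 1.0;
        }
        return reflectance;
    }

    public static void main(String[] args) {
        // Hypothetical panel values: raw counts 100 and 900, reflectances 0.2 and 0.9.
        EmpiricalLine elm = new EmpiricalLine(100, 900, 0.2, 0.9);
        System.out.println(elm.apply(500));
    }
}
```

Because the fit depends on single-pixel panel samples, a noisy or misplaced click changes the gain and offset for the whole image, which is why re-clicking the panels after starting a file save matters.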

•  Parameter  File  Modification 

The parameter file is saved as “C:/VideoFiles/ParamFile.txt”. Open the file in a text
editor, make the necessary changes, then save the file. The parameters in the system are updated
only on startup or when the “Update Parameters” button is clicked, not when the file is saved.
Available parameters and formatting are discussed in Section 3.1.2.

B.2  Processing  Program 

•  Startup 

In Eclipse, open ProcMain.java in the GUI package of the ProcessingProgram project,
and select run. The first window presented asks for an output file name, which must be entered in
order to continue.

•  Closing 

The  program  closes  automatically  when  finished. 

•  Algorithm  Selection 

After  an  output  filename  is  selected,  a  small  GUI  containing  a  list  of  available  algorithms 
is  presented.  Pull  down  the  list  and  select  the  desired  algorithm,  then  click  run.  Note:  the 
algorithms  have  no  built  in  dependency  structure,  so  be  sure  you  have  created  all  videos  the 
algorithm  will  require  prior  to  attempting  to  run  the  algorithm. 

• Input File Selection and Algorithm Execution

After the algorithm is selected, input file selection windows will appear. The required
wavelength is listed in the window title. Selecting a video containing data other than that
required will still produce an output. System messages and progress are displayed in the Eclipse
console box.

•  Concerns  for  ELM 

Due  to  the  nature  of  the  Processing  Program,  ELM  can  be  difficult.  The  initial  panel 
coordinates  are  saved  in  the  file  header,  and  are  applied  based  off  the  first  image.  If  the  panels 
are  moved  or  have  too  much  noise  at  that  exact  pixel,  it  may  be  impossible  to  perform  useful 
ELM. 

The  recommendation  is  to  start  all  desired  file  saves,  then  click  on  the  panels  again  so  all 
data  after  that  will  be  known  to  be  useful.  A  better  alternative  is  to  save  the  ELM  processed 
streams  instead  of  the  raw,  although  that  will  result  in  slightly  reduced  frame  rates. 

•  Synchronization 

In algorithms requiring more than one file, the images are synchronized by time stamp.
Any frames not overlapping the other file are discarded in the output, and if no overlapping
frames are found, then no output is produced.
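One way such time-stamp synchronization could work is sketched below. This is an assumption, not the thesis implementation: the class name is hypothetical, time stamps are taken to be sorted millisecond values, and the matching tolerance is an invented parameter. Frames with no partner in the other file are skipped, and an empty result corresponds to the case where no output is produced.

```java
import java.util.ArrayList;
import java.util.List;

/** Pairs frames from two video files by time stamp (illustrative sketch). */
final class TimestampSync {

    /**
     * Walks two ascending time-stamp arrays and returns index pairs {i, j}
     * whose time stamps differ by at most toleranceMs. Unmatched frames are dropped.
     */
    static List<int[]> pair(long[] a, long[] b, long toleranceMs) {
        List<int[]> pairs = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            long diff = a[i] - b[j];
            if (Math.abs(diff) <= toleranceMs) {
                pairs.add(new int[] {i, j});
                i++;
                j++;
            } else if (diff < 0) {
                i++;   // a[i] is earlier than every remaining frame in b
            } else {
                j++;   // b[j] is earlier than every remaining frame in a
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        long[] first = {0, 100, 200, 300};
        long[] second = {195, 305, 400};
        for (int[] p : pair(first, second, 10)) {
            System.out.println("frame " + p[0] + " <-> frame " + p[1]);
        }
    }
}
```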

• Parameter File

The parameter file is shared with the Acquisition Program, and is loaded at startup only.
The location of the file is “C:/VideoFiles/ParamFile.txt”. See Section 3.1.2 for details on
available parameters and formatting.

B.3 Playback Program

•  Startup 

In Eclipse, open PlayMain.java in the GUI package of the PlaybackProgram project, and
select run. The first window presented asks for an input file name, which must be entered in order
to continue. Following the file selection, the main GUI immediately appears, and the video begins
to play.

•  Closing 

To  close  the  program,  click  the  “Exit”  button  on  the  bottom  of  the  main  GUI.  The  “X”  in 
the  upper  right  corner  will  not  close  the  program. 

•  Pause/Play 

Pausing and playing is done by clicking the button below the image with the play or
pause symbol on it, which alternates depending on the state. Current frame number and time in
seconds since the beginning of the file are displayed directly below the image.

•  Moving  in  the  Video 

Two options are available: moving to a time and moving to a frame number. They are
clearly marked in the lower left corner of the GUI. Enter a number into the appropriate box, and
click the button next to it. The video changes to the appropriate image, and the video is
automatically paused.

One thing to keep in mind is that the move function is slower the farther into a file the desired
frame is, and can take a considerable length of time. Also, the frame number is the number of frames
from when the Acquisition Program was started until that image was taken, while the time displayed
is in seconds from the beginning of the file. So, when attempting to view the same image in multiple
video files, searching by frame number will pull up the same scene while searching by time will
not.